Sen(Qian)’s Memo

This website is a collection of memos by Donglin Qian (Torin Sen), mainly about machine learning papers and competitive programming.

Cost-Sensitive

2024-10-25

NLP LLM PU Cost-Sensitive Paper Text Detection

2024-ICLR-Multiscale Positive-Unlabeled Detection of AI-Generated Texts

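Deciding whether a text was generated by an LLM is very hard for short texts. Short human-written and LLM-generated texts look alike to begin with, so the paper simply treats them as Unlabeled and brings in the nnPU framework. Under the assumption that "the value playing the role of the class prior depends only on the length of the text", the experiments (with some hyperparameter tuning) outperformed prior work.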

→Read more

2024-09-18

Paper Negative Assumption PU Cost-Sensitive Case-Control MAE Mix-up Entropy Regularization

2022-CVPR-[Dist-PU] Positive-Unlabeled Learning from a Label Distribution Perspective

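For the predicted label probabilities, we want the mean to be 1 on P and equal to the class prior on U. 1. A trivial solution is for every probability on U to equal the class prior. To prevent this, Entropy Minimization is also applied to the distribution on U. 2. That alone overfits, so mix-up is introduced, and Entropy Minimization is applied to the mixed-up data as well.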
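A minimal sketch of why the entropy term is needed, in plain Python. The function names (`alignment_gap`, `mean_entropy`, `mixup`) and the unweighted form of the gap are my own simplifications for illustration, not the paper's exact objective. It shows that the trivial solution and a confident solution both match the label distribution, and only the entropy tells them apart.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def alignment_gap(probs_p, probs_u, prior):
    """Simplified gap: mean prediction on P vs 1, mean prediction on U vs the prior."""
    return abs(mean(probs_p) - 1.0) + abs(mean(probs_u) - prior)

def mean_entropy(probs, eps=1e-12):
    """Average binary entropy of the predictions (lower means more confident)."""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
    return total / len(probs)

def mixup(x_a, x_b, lam):
    """Convex combination of two feature vectors; lam is assumed to be in [0, 1]."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]

prior = 0.25
probs_p = [1.0, 1.0]
trivial_u = [0.25, 0.25, 0.25, 0.25]   # every U prediction equals the prior
confident_u = [1.0, 0.0, 0.0, 0.0]     # a fraction `prior` of U predicted positive

for name, probs_u in [("trivial", trivial_u), ("confident", confident_u)]:
    print(name, alignment_gap(probs_p, probs_u, prior), mean_entropy(probs_u))
```

Both candidates have zero gap, but the trivial one has much higher entropy, so adding an entropy penalty pushes the model away from it.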

→Read more

2024-08-24

Bias PNU Noisy-Label PU Propensity-Score Sample-Selection Cost-Sensitive Paper

2019-ICML-[PUbN] Classification from Positive, Unlabeled and Biased Negative Data

→Read more

2024-05-23

PU Cost-Sensitive SAR Bias Case-Control Paper

2019-ICLR-[PUSB]Learning from Positive and Unlabeled Data with a Selection Bias

→Read more

2024-05-21

PU Cost-Sensitive Case-Control Gradient Ascent Paper

2017-NIPS-[nnPU] Positive-Unlabeled Learning with Non-Negative Risk Estimator

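In the PU training objective, clipping the empirical loss so that it does not go negative (or rather, does not fall below a fixed value) works nicely. In practice, when it does fall below that value, the term responsible for making the overall loss negative (see the paper) is pulled out, and gradient ascent is performed with its gradient to prevent overfitting.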
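A minimal sketch of the clipped risk, assuming a sigmoid loss and classifier scores that are already computed. The names `nnpu_risk`, `prior` and `beta` are chosen here for illustration. There is no autograd in this sketch, so instead of taking gradients it only reports the clipped risk and whether the negative-part term has dropped below `-beta`, which is the situation where the gradient-ascent step on that term would be taken.

```python
import math

def sigmoid_loss(z):
    """Sigmoid loss l(z) = 1 / (1 + exp(z)); fine for moderate |z|."""
    return 1.0 / (1.0 + math.exp(z))

def mean(xs):
    return sum(xs) / len(xs)

def nnpu_risk(scores_p, scores_u, prior, beta=0.0):
    """Return (clipped_risk, needs_ascent) for one batch of scores.

    scores_p / scores_u: classifier outputs on positive / unlabeled samples.
    prior: assumed class prior of the positive class.
    beta: tolerance below zero before the correction kicks in.
    """
    r_p_pos = mean([sigmoid_loss(s) for s in scores_p])    # P treated as positive
    r_p_neg = mean([sigmoid_loss(-s) for s in scores_p])   # P treated as negative
    r_u_neg = mean([sigmoid_loss(-s) for s in scores_u])   # U treated as negative
    negative_part = r_u_neg - prior * r_p_neg              # can go below zero
    clipped_risk = prior * r_p_pos + max(0.0, negative_part)
    needs_ascent = negative_part < -beta
    return clipped_risk, needs_ascent

risk, needs_ascent = nnpu_risk([2.0, 3.0], [-5.0, -6.0], prior=0.5)
print(risk, needs_ascent)
```

In a real training loop, `needs_ascent` would switch the update from descending on the full risk to ascending on `negative_part` alone.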

→Read more

2024-05-15

PU Cost-Sensitive Semi-supervised Learning PNU On-Hold Paper

2017-ICML-[PNU]Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data

The paper first unifies the loss functions of PU and NU learning, and then proposes PNU learning.
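A small sketch of how the combined risks can be formed, assuming the PN, PU and NU risks have already been computed as plain numbers. The convex-combination form and the parameter names `gamma` and `eta` follow my recollection of the paper, so treat them as assumptions rather than a faithful reproduction.

```python
def punu_risk(r_pu, r_nu, gamma):
    """Blend the PU and NU risks; gamma is assumed to lie in [0, 1]."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    return (1.0 - gamma) * r_pu + gamma * r_nu

def pnu_risk(r_pn, r_pu, r_nu, eta):
    """Blend the PN risk with the PU risk (eta >= 0) or the NU risk (eta < 0).

    eta is assumed to lie in [-1, 1]; eta = 0 recovers ordinary PN learning.
    """
    if not -1.0 <= eta <= 1.0:
        raise ValueError("eta must be in [-1, 1]")
    if eta >= 0.0:
        return (1.0 - eta) * r_pn + eta * r_pu
    return (1.0 + eta) * r_pn - eta * r_nu

# Made-up risk values, only to show the interpolation.
print(pnu_risk(r_pn=0.2, r_pu=0.4, r_nu=0.6, eta=0.5))
print(pnu_risk(r_pn=0.2, r_pu=0.4, r_nu=0.6, eta=-0.5))
```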

→Read more

2024-05-10

PU Cost-Sensitive Density Estimation On-Hold Outlier Detection Paper

2015-ICML-[uPU] Convex Formulation for Learning from Positive and Unlabeled Data

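In PU learning, 2014-Ramp rewrote R_X using the complementary event, a trick that makes the sum of the loss terms a constant. This paper instead substitutes directly, without the complementary-event rewrite, and argues that good properties still hold when the difference of the loss terms is a linear function (set to -z). The rest is a discussion of outlier detection, which was too difficult for me to follow.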
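A quick numerical check of the "difference is -z" condition, assuming it means l(z) - l(-z) = -z. The two losses below (a scaled squared loss and the logistic loss) are examples I am fairly sure satisfy it; the hinge loss is included as a counterexample. Function names are my own.

```python
import math

def squared_loss(z):
    """Scaled squared loss l(z) = (z - 1)^2 / 4."""
    return 0.25 * (z - 1.0) ** 2

def logistic_loss(z):
    """Logistic loss l(z) = log(1 + exp(-z))."""
    return math.log(1.0 + math.exp(-z))

def hinge_loss(z):
    """Hinge loss l(z) = max(0, 1 - z); does not satisfy the condition."""
    return max(0.0, 1.0 - z)

def max_deviation(loss, zs):
    """Largest |l(z) - l(-z) - (-z)| over the given margins."""
    return max(abs(loss(z) - loss(-z) + z) for z in zs)

zs = [-3.0, -1.0, -0.2, 0.0, 0.5, 2.0]
for loss in (squared_loss, logistic_loss, hinge_loss):
    print(loss.__name__, max_deviation(loss, zs))
```

When the condition holds, the term on P in the PU risk becomes linear in the classifier output, so the whole objective stays convex as long as the loss itself is convex.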

→Read more

2024-05-10

PU Cost-Sensitive Class Prior Paper

2014-NIPS-[Ramp]Analysis of Learning from Positive and Unlabeled Data

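With some rearranging of the formulas, PU learning reduces to existing weighted learning from Positive and Negative data. For PN the hinge loss is the usual choice, but for PU the ramp loss is better because it removes a loss term from the objective. The paper also explains the effect of a misestimated class prior. Theoretically, PU performs at most 2√2 times worse than PN.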
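A short check of why the ramp loss lets a term drop out, assuming the scaled form l(z) = max(0, min(2, 1 - z)) / 2. With this loss, l(z) + l(-z) is always 1, so the sum of the two loss terms on P is a constant. The hinge loss does not have this property. Function names are my own.

```python
def ramp_loss(z):
    """Ramp loss l(z) = max(0, min(2, 1 - z)) / 2, bounded in [0, 1]."""
    return 0.5 * max(0.0, min(2.0, 1.0 - z))

def hinge_loss(z):
    """Hinge loss l(z) = max(0, 1 - z), unbounded above."""
    return max(0.0, 1.0 - z)

def symmetric_sum(loss, z):
    """l(z) + l(-z); constant in z only for symmetric losses such as ramp."""
    return loss(z) + loss(-z)

for z in [-3.0, -0.3, 0.0, 0.7, 3.0]:
    print(z, symmetric_sum(ramp_loss, z), symmetric_sum(hinge_loss, z))
```

The ramp column stays at 1 for every margin, while the hinge column grows with |z|, which is the extra term that shows up in the PU objective when hinge is used.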

→Read more

2023-12-20

PU PNU Cost-Sensitive Single-Training-Set Case-Control Resampling Survey Paper

2020-Survey-Learning from positive and unlabeled data: a survey

→Read more