Sen(Qian)’s Memo

This website is a memo by Donglin Qian (Torin Sen), mainly covering machine learning papers and competitive programming.


2024-10-04

2022-NIPS-Positive-Unlabeled Learning using Random Forests via Recursive Greedy Risk Minimization

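When a decision tree splits a node into two child groups, it picks the feature and threshold that reduce the entropy or Gini impurity the most. Entropy and Gini impurity can in fact be rewritten as an objective built from a loss function. The paper therefore takes the uPU and nnPU risk formulas from prior work as the objective and uses them to decide the tree's splits. On the forest side, a Random Forest normally gives each tree a randomly chosen subset of features and data; here even the threshold is chosen at random, a method known as Extra Trees.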
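As a rough illustration of the idea, the sketch below scores an Extra-Trees-style random split with an nnPU-style risk. It is a simplified assumption, not the paper's algorithm: each child predicts one constant label under 0-1 loss, the totals are taken over the samples passed in, and the names `node_risk` and `random_split` are hypothetical.

```python
import random

def node_risk(n_pos_node, n_unl_node, n_pos_total, n_unl_total, prior):
    """nnPU-style 0-1 risk of a node that predicts one constant label."""
    p_frac = n_pos_node / n_pos_total   # share of all labeled positives in this node
    u_frac = n_unl_node / n_unl_total   # share of all unlabeled samples in this node
    # Predict +1: positives are correct; the negative-class term is clipped at 0 (nnPU).
    risk_if_positive = max(0.0, u_frac - prior * p_frac)
    # Predict -1: every labeled positive in the node is misclassified.
    risk_if_negative = prior * p_frac
    return min(risk_if_positive, risk_if_negative)

def random_split(X, s, prior, n_candidates=10, rng=None):
    """Draw random (feature, threshold) pairs and keep the one with the lowest risk.

    X: list of feature vectors; s: 1 for labeled positive, 0 for unlabeled.
    Assumes s contains at least one positive and one unlabeled sample.
    Returns (feature_index, threshold, risk), or None if no valid split was drawn.
    """
    rng = rng or random.Random()
    n_pos_total = sum(s)
    n_unl_total = len(s) - n_pos_total
    best = None
    for _ in range(n_candidates):
        j = rng.randrange(len(X[0]))          # random feature
        values = [row[j] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue                          # constant feature, cannot split
        t = rng.uniform(lo, hi)               # random threshold (Extra Trees)
        risk = 0.0
        for side in (True, False):            # left child, then right child
            labels = [si for row, si in zip(X, s) if (row[j] <= t) == side]
            n_pos = sum(labels)
            risk += node_risk(n_pos, len(labels) - n_pos,
                              n_pos_total, n_unl_total, prior)
        if best is None or risk < best[2]:
            best = (j, t, risk)
    return best
```

Calling `random_split(X, s, prior=0.5, rng=random.Random(0))` returns one candidate split; a full tree would apply it recursively to each child.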


2024-10-03

PU Paper Bias SAR SCAR EM-Algorithm Single-Training-Set Detail Article Noisy-Label

2021-TPAMI-[LBE]Instance-Dependent Positive and Unlabeled Learning With Labeling Bias Estimation

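In the graphical model, the ground-truth label y_i is a latent variable, while the labeling indicator s_i (whether the instance is labeled) and the instance x_i are observed variables. A multilayer perceptron or logistic regression is used as the model, which defines the required p(y_i | x_i) and p(s_i | x_i, y_i) according to the graphical model. Training itself is done with the EM algorithm (not variational inference).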
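The E-step for this kind of model can be sketched with Bayes' rule. The function below is a minimal illustration under the usual PU assumption that negatives are never labeled, i.e. p(s=1 | x, y=0) = 0; the name `posterior_positive` and the arguments `h` and `eta` are hypothetical stand-ins for the two model outputs.

```python
def posterior_positive(h, eta, s):
    """E-step posterior p(y=1 | x, s).

    h:   model output p(y=1 | x)
    eta: model output p(s=1 | x, y=1), the labeling probability for a positive
    s:   1 if the instance is labeled, 0 if it is unlabeled
    Assumes the unlabeled case is not degenerate (not both h == 1 and eta == 1).
    """
    if s == 1:
        return 1.0  # only positives can be labeled, so a labeled instance is positive
    unlabeled_pos = h * (1.0 - eta)   # p(y=1, s=0 | x)
    unlabeled_neg = 1.0 - h           # p(y=0, s=0 | x): negatives are never labeled
    return unlabeled_pos / (unlabeled_pos + unlabeled_neg)
```

In an EM loop these posteriors would serve as soft targets when the two models are refit in the M-step.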


2024-10-02

Paper PU Negative Assumption Class Prior Case-Control Detail Article Best Bin Estimation Pseudo Label

2021-NIPS-[TEDn]Mixture Proportion Estimation and PU Learning: A Modern Approach

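The class prior is estimated with a method called BBE. Over all thresholds, the ratio (fraction of U above the threshold) / (fraction of P above the threshold) is computed, and its minimum value is taken as the class prior. For training, after a warm-up (ordinary, rough PN learning), the 1-π fraction of U with the smallest loss l(f(x), -1) is given the pseudo label Negative, and training continues as PN learning that accounts for the weight π. This part is based on self-supervised learning. Class prior estimation and self-supervised learning are alternated.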
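The two steps can be sketched roughly as follows. This is a simplified illustration of the ratio described above and not the paper's full BBE procedure: it omits any finite-sample correction, so the minimum can be noisy when few P scores exceed the threshold. The function names are hypothetical.

```python
def estimate_class_prior(scores_p, scores_u):
    """Minimum over thresholds of (fraction of U above c) / (fraction of P above c).

    Thresholds are taken from the P scores, so the denominator is never zero.
    """
    best = 1.0
    for c in sorted(set(scores_p)):
        frac_p = sum(v >= c for v in scores_p) / len(scores_p)
        frac_u = sum(v >= c for v in scores_u) / len(scores_u)
        best = min(best, frac_u / frac_p)
    return best

def pseudo_negative_indices(losses_u, prior):
    """Indices of the (1 - prior) fraction of U with the smallest loss l(f(x), -1)."""
    n_keep = int((1.0 - prior) * len(losses_u))
    order = sorted(range(len(losses_u)), key=lambda i: losses_u[i])
    return order[:n_keep]
```

In an alternating scheme, the classifier scores would be recomputed after each training round before calling `estimate_class_prior` again.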


2024-10-01

Paper Detail Article Optimal transport

On Optimal Transport


2024-09-30

Paper PU Optimal transport Wasserstein Distance Noisy-Label

2020-NIPS-Partial Optimal Transport with Applications on Positive-Unlabeled Learning

This looks usable for denoising as well.


2024-09-25

EM-Algorithm ELBO Detail Article Generative Variational

On ELBO and the EM Algorithm

My own study notes on this topic.


2024-09-25

Paper PU SCAR SAR Generative VAE Gaussian Model

2020-CIKM-[VAE-PU]Deep Generative Positive-Unlabeled Learning under Selection Bias

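A solution for the SAR assumption, where the labeled set P is biased: a generative model produces positives that belong to the true positive distribution but are missing from the given data, these are combined with the given positives, and the result is trained with the PU learning objective for the SAR assumption. Generation is based on a VAE and uses adversarial training against a discriminator.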


2024-09-21

Paper PU Mix-up Negative Assumption Case-Control Square Loss

2020-onlyarxiv-[MixPUL] Consistency-based Augmentation for Positive and Unlabeled Learning


2024-09-20

Paper PU Mix-up Negative Assumption Data Augmentation

2022-ICLR-[P3Mix]Who Is Your Right Mixup Partner in Positive and Unlabeled Learning

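To counter the negative assumption in cost-sensitive PU learning, this method trains on mixup of data near the decision boundary (roughly, samples whose predicted probability lies in [0.5-x, 0.5+x]). The interesting part is that, although the method is cost-sensitive, the weights of the PU terms are hyperparameters rather than fixed values.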
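A minimal sketch of the boundary selection and mixup step is shown below. It is an assumption-laden illustration rather than the P3Mix procedure itself: the partner is drawn from the same boundary set, the mixing coefficient comes from a Beta(alpha, alpha) distribution as in standard mixup, and all function names are hypothetical.

```python
import random

def near_boundary(probs, x):
    """Indices of samples whose predicted probability lies in [0.5 - x, 0.5 + x]."""
    return [i for i, p in enumerate(probs) if 0.5 - x <= p <= 0.5 + x]

def mixup(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two feature vectors."""
    return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]

def mix_boundary_samples(features, probs, x=0.1, alpha=1.0, rng=None):
    """Mix each near-boundary sample with a randomly chosen near-boundary partner."""
    rng = rng or random.Random()
    idx = near_boundary(probs, x)
    mixed = []
    for i in idx:
        j = rng.choice(idx)                   # partner choice is an assumption here
        lam = rng.betavariate(alpha, alpha)   # standard mixup coefficient
        mixed.append(mixup(features[i], features[j], lam))
    return mixed
```

In training, the labels (or pseudo labels) of each pair would be mixed with the same `lam` as the features.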


2024-09-20

Paper PU Noisy-Label Entropy Regularization Pseudo Label Moving Average Negative Assumption

2020-onlyarxiv-A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels

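P is handled with an ordinary loss, as labeled data. For U, the loss is the KL divergence from the pseudo labels. In addition, the mean of all calibrated predictions over U should equal the class prior. To explicitly prevent the prediction for every U sample from collapsing to the class prior, entropy minimization is added. The pseudo labels are a moving average of the model outputs over the past several epochs.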
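The unlabeled-side loss could be sketched as below for binary predictions. Several details are assumptions made for illustration only: the KL direction, the squared gap used for the class-prior term, the weights `w_prior` and `w_entropy`, and the window length of the moving average. None of the names come from the paper.

```python
import math
from collections import deque

EPS = 1e-12

def _clamp(p):
    """Keep probabilities away from 0 and 1 so the logarithms stay finite."""
    return min(max(p, EPS), 1.0 - EPS)

def binary_kl(q, p):
    """KL(Bernoulli(q) || Bernoulli(p))."""
    q, p = _clamp(q), _clamp(p)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

def binary_entropy(p):
    """Entropy of Bernoulli(p) in nats."""
    p = _clamp(p)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

class PseudoLabel:
    """Moving average of one sample's model outputs over the last `window` epochs."""
    def __init__(self, window=3):
        self.history = deque(maxlen=window)

    def update(self, prob):
        self.history.append(prob)
        return sum(self.history) / len(self.history)

def unlabeled_loss(preds, pseudo, prior, w_prior=1.0, w_entropy=0.1):
    """KL to pseudo labels + class-prior matching + entropy minimization on U."""
    n = len(preds)
    kl = sum(binary_kl(q, p) for q, p in zip(pseudo, preds)) / n
    prior_gap = (sum(preds) / n - prior) ** 2   # mean prediction should match the prior
    ent = sum(binary_entropy(p) for p in preds) / n
    return kl + w_prior * prior_gap + w_entropy * ent
```

The entropy term pushes individual predictions toward 0 or 1, which counteracts the trivial solution where every prediction equals the class prior.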
