Anchor Data Augmentation (ADA): A Domain-Agnostic Method for Enhancing Regression Models

DATE POSTED: November 14, 2024

:::info Authors:

(1) Nora Schneider, Computer Science Department, ETH Zurich, Zurich, Switzerland ([email protected]);

(2) Shirin Goshtasbpour, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland ([email protected]);

(3) Fernando Perez-Cruz, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland ([email protected]).

:::

Table of Links

Abstract and 1 Introduction

2 Background

2.1 Data Augmentation

2.2 Anchor Regression

3 Anchor Data Augmentation

3.1 Comparison to C-Mixup and 3.2 Preserving nonlinear data structure

3.3 Algorithm

4 Experiments and 4.1 Linear synthetic data

4.2 Housing nonlinear regression

4.3 In-distribution Generalization

4.4 Out-of-distribution Robustness

5 Conclusion, Broader Impact, and References

A Additional information for Anchor Data Augmentation

B Experiments

3 Anchor Data Augmentation

In this section, we introduce Anchor Data Augmentation (ADA), a domain-independent data augmentation method inspired by AR. ADA requires neither prior knowledge of the data invariances nor manually engineered transformations. Unlike existing domain-agnostic data augmentation methods [10, 45, 46], it does not require training an expensive generative model, and the augmentation adds only marginally to the computational complexity of training. Moreover, since ADA originates from a causal regression problem, it can be readily applied to regression tasks. Even in cases where ADA does not improve performance, its adverse effect on performance remains minimal.
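
As a rough illustration, the sketch below shows one way an ADA-style augmentation could be implemented with NumPy and scikit-learn. It assumes, as in the figure below, that the anchor matrix A is built from one-hot k-means cluster assignments, so the projection Π_A maps each sample to its cluster mean and the anchor-regression transform becomes x̃ = x + (√γ − 1) Π_A x (and likewise for y). The function name, the choice to cluster the joint (X, y), and the γ grid are illustrative assumptions, not the authors' exact algorithm, which is given in Section 3.3.

```python
import numpy as np
from sklearn.cluster import KMeans

def ada_augment(X, y, n_clusters=10, gammas=(1/2, 2/3, 3/2, 2.0), seed=0):
    """Hypothetical ADA-style augmentation sketch.

    Builds a discrete anchor matrix A from k-means cluster assignments, so
    that the projection P_A x equals the mean of x's cluster, and applies
    the anchor-regression transform x_tilde = x + (sqrt(gamma) - 1) * P_A x
    (and the same for y) for each gamma in `gammas`.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    # Cluster on (X, y) jointly; this clustering choice is an assumption here.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(np.c_[X, y])

    X_aug, y_aug = [X], [y]
    for gamma in gammas:
        shift = np.sqrt(gamma) - 1.0
        Xg, yg = X.copy(), y.copy()
        for c in range(n_clusters):
            mask = labels == c
            # P_A replaces each member of cluster c with the cluster mean.
            Xg[mask] += shift * X[mask].mean(axis=0)
            yg[mask] += shift * y[mask].mean()
        X_aug.append(Xg)
        y_aug.append(yg)
    return np.vstack(X_aug), np.concatenate(y_aug)

# Example: augment a small nonlinear (cosine) regression dataset.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.cos(2 * X[:, 0]) + 0.1 * rng.normal(size=200)
X_new, y_new = ada_augment(X, y, n_clusters=20)
print(X_new.shape, y_new.shape)  # (1000, 1) (1000,)
```

In this sketch, γ scales the component of each sample that lies in the anchor (cluster-mean) subspace by √γ, so values above and below 1 perturb the data in opposite directions, which is the knob that controls the strength of the augmentation.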

Figure: Comparison of ADA augmentations on a nonlinear Cosine data model. For a larger partition size, ADA augmentations are more accurate due to the high local variability of the Cosine function. We used k-means clustering to construct A and γ ∈ {1/2, 2/3, 1.0, 3/2, 2.0}.

:::info This paper is available on arxiv under CC0 1.0 DEED license.

:::
