Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Conferences, Workshops, Symposiums, and Seminars
  4. Efficient Distributed Decision Trees for Robust Regression
 
conference paper

Efficient Distributed Decision Trees for Robust Regression

Guo, Tian  
•
Kutzkov, Konstantin
•
Ahmed, Mohammed
Show more
2016
ECML PKDD 2016: Machine Learning and Knowledge Discovery in Databases
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD)

The availability of massive volumes of data and recent advances in data collection and processing platforms have motivated the development of distributed machine learning algorithms. In numerous real-world applications large datasets are inevitably noisy and contain outliers. These outliers can dramatically degrade the performance of standard machine learning approaches such as regression trees. To this end, we present a novel distributed regression tree approach that utilizes robust regression statistics, statistics that are more robust to outliers, for handling large and noisy data. We propose to integrate robust statistics based error criteria into the regression tree. A data summarization method is developed and used to improve the efficiency of learning regression trees in the distributed setting. We implemented the proposed approach and baselines based on Apache Spark, a popular distributed data processing platform. Extensive experiments on both synthetic and real datasets verify the effectiveness and efficiency of our approach.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

main.pdf

Access type

openaccess

Size

2.2 MB

Format

Adobe PDF

Checksum (MD5)

d79723d1b75f817316d71fc6eb003fa0

Loading...
Thumbnail Image
Name

prel_report_network_dataset.pdf

Access type

openaccess

Size

543.23 KB

Format

Adobe PDF

Checksum (MD5)

1af242b8917b2e87bff291a9ba6b3f9a

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés