No items found.

Research Papers

Read research papers written by the Hopsworks team.

November 19, 2020

Alexandru A. Ormenisan, Moritz Meister, Fabio Buso, Robin Andersson, Seif Haridi, Jim Dowling

Implicit model for provenance can be used next to a feature store with versioned data to build reproducible and more easily debugged ML pipelines. We provide development tools and visualization support that can help developers more easily navigate and re-run pipelines .

November 19, 2020

Mahmoud Ismail, Salman Niazi, Gautier Berthou, Mikael Ronström, Seif Haridi, Jim Dowling

HopsFS-S3 is a hybrid cloud-native distributed hierarchical file system that is available across availability zones, has the same cost as S3, but has 100X the performance of S3 for file move/rename operations, and 3.4X the read throughput of S3 (EMRFS) for the DFSIO Benchmark.

November 19, 2020

Mahmoud Ismail, Salman Niazi, Mauritz Sundell, Mikael Ronstrom, Seif Haridi, and Jim Dowling

HopsFS-CL is a highly available distributed hierarchical file system with native support for AZ awareness using synchronous replication protocols.

November 19, 2020

Moritz Meister, Sina Sheikholeslami, Amir H. Payberah, Vladimir Vlassov, Jim Dowling

Maggy is an extension to Spark’s synchronous processing model to allow it to run asynchronous ML trials, enabling end-to-end state-of-the-art ML pipelines to be run fully on Spark. Maggy provides programming support for defining, optimizing, and running parallel ML trials.

February 24, 2020

Moritz Meister, Sina Sheikholeslami, Robin Andersson, Alexandru A. Ormenisan, Jim Dowling

The distribution oblivious training function allows ML developers to reuse the same training function when running a single host Jupyter notebook or performing scale-out hyperparameter search and distributed training on clusters.

February 24, 2020

Alexandru A. Ormenisan, Mahmoud Ismail, Seif Haridi, Jim Dowling

Implicit provenance allows us to capture full lineage for ML programs, by only instrumenting the distributed file system and APIs and with no changes to the ML code.

July 5, 2019

Mahmoud Ismail, August Bonds, Salman Niazi, Seif Haridi, Jim Dowling.

New version of block reporting protocol for HopsFS that uses up to 1/1000th of the resources of HDFS' block reporting protocol. IEEE BigDataCongress’19.

© Hopsworks 2024. All rights reserved. Various trademarks held by their respective owners.

Privacy Policy
Cookie Policy
Terms and Conditions