CouchFS: A High-Performance File System for Large Data Sets
Numerous file systems have been implemented to meet the needs of today's big data era; however, many of them require specific configurations or frameworks for data processing. This paper presents CouchFS, a POSIX-compliant distributed file system for large data sets. We build CouchFS on top of CouchDB, which gives us the flexibility to handle semistructured data. Because a database behaves much like a file system, and CouchDB provides highly customizable MapReduce views for indexing, CouchFS is able to achieve high-performance searching for both text and supported binary objects. This work compares search over Wikipedia data using CouchDB, PostgreSQL, and Spotlight on the HFS+ file system. We present our design of CouchFS and discuss future approaches to improving this file system.
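To illustrate the MapReduce-view indexing idea mentioned above, the following is a minimal in-memory sketch in Python. It does not use the CouchDB API and is not the view definition used by CouchFS; the function names (`map_doc`, `build_view`, `reduce_count`), the document fields, and the sample documents are all hypothetical. It only shows the general pattern: a map step emits key/value pairs per document, the pairs are kept sorted by key, and a reduce step aggregates the values for a key.

```python
import re
from collections import defaultdict


def map_doc(doc):
    """Map step (hypothetical): emit one (word, doc_id) pair per distinct word."""
    words = set(re.findall(r"[a-z0-9]+", doc["text"].lower()))
    for word in sorted(words):
        yield word, doc["_id"]


def build_view(docs):
    """Collect the emitted pairs into an index sorted by key."""
    index = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            index[key].append(value)
    return dict(sorted(index.items()))


def reduce_count(values):
    """Reduce step (hypothetical): count the documents that contain a key."""
    return len(values)


# Sample documents, invented for illustration only.
docs = [
    {"_id": "a.txt", "text": "CouchDB stores JSON documents"},
    {"_id": "b.txt", "text": "Documents are indexed by views"},
]
view = build_view(docs)
```

Under these assumptions, a text search for a word becomes a key lookup such as `view["documents"]` rather than a scan over every file, which is the kind of benefit the abstract attributes to view-based indexing.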