Parquet File

Let’s put everything together and read a complete Parquet file.

Task

Implement the read_parquet function in src/reader.rs.

pub fn read_parquet(file_path: impl AsRef<Path>) -> Result<DataFrame> {
    todo!("step08: implement read parquet")
}

Test

The test case for this step is step08_parquet_file.

Hints and Solution

Hint (steps to parse a parquet file)
  • Read the file into Vec<u8> and convert it to Bytes
  • Verify the magic number
  • Read the file metadata
  • Read the row groups
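To make the magic-number and metadata steps concrete, here is a minimal sketch that works on a plain byte slice using only the standard library. It relies on the Parquet file layout: the file starts and ends with the 4 bytes PAR1, and the 4 bytes just before the trailing magic hold the length of the file metadata as a little-endian integer. The names check_magic and footer_metadata_range are hypothetical and are not the helpers used in this book, whose signatures and error types may differ.

```rust
/// Magic bytes that open and close every Parquet file.
const MAGIC: &[u8] = b"PAR1";

/// Hypothetical helper: checks that `data` starts and ends with `PAR1`.
fn check_magic(data: &[u8]) -> Result<(), String> {
    // Smallest plausible file: header magic (4) + metadata length (4) + footer magic (4).
    if data.len() < 12 {
        return Err(format!("file too small: {} bytes", data.len()));
    }
    if !data.starts_with(MAGIC) {
        return Err("missing header magic".to_string());
    }
    if !data.ends_with(MAGIC) {
        return Err("missing footer magic".to_string());
    }
    Ok(())
}

/// Hypothetical helper: returns the byte range occupied by the serialized
/// file metadata, located via the length field in the footer.
fn footer_metadata_range(data: &[u8]) -> Result<std::ops::Range<usize>, String> {
    check_magic(data)?;
    let n = data.len();
    // The 4 bytes before the trailing magic store the metadata length (little-endian).
    let len_bytes = [data[n - 8], data[n - 7], data[n - 6], data[n - 5]];
    let meta_len = u32::from_le_bytes(len_bytes) as usize;
    // The metadata ends where the length field begins.
    let end = n - 8;
    // It must fit between the header magic and the length field.
    if meta_len > end - 4 {
        return Err(format!("metadata length {} exceeds file size", meta_len));
    }
    Ok(end - meta_len..end)
}
```

The returned range only locates the metadata bytes; decoding them into a file metadata structure is the job of the metadata reader from the earlier steps.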
Solution
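pub fn read_parquet(file_path: impl AsRef<Path>) -> Result<DataFrame> {
    // Read the whole file into memory (read_to_end needs std::io::Read in scope).
    let mut file = File::open(file_path)?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;

    // Wrap the buffer in Bytes so it can be handed to each helper.
    let data = Bytes::from(buf);
    // Verify the magic number at both ends of the file.
    ensure_header_footer_magic(data.clone())?;
    // Parse the file metadata.
    let file_metadata = read_file_metadata(data.clone())?;
    // Decode the row groups described by the metadata into a DataFrame.
    let df = read_row_groups(data, &file_metadata)?;
    Ok(df)
}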