cosovo
    Fetch and parse ranges of CSV files.

    npm install cosovo
    

    Parse a remote CSV file from a URL:

    import { parseURL } from 'cosovo'
    const url = 'https://data.source.coop/severo/csv-papaparse-test-files/sample.csv'
    const rows = []
    for await (const { row } of parseURL(url)) {
      rows.push(row)
    }
    console.log(rows)
    // Output: [ [ 'A', 'B', 'C' ], [ 'X', 'Y', 'Z' ] ]

    The parseURL function yields an object for each row with the following properties:

    • row: array of strings with the values of the row.
    • errors: array of parsing errors found in the row.
    • meta: object with metadata about the parsing process.

    The format is described on the doc pages: https://severo.github.io/cosovo/interfaces/ParseResult.html.

    The row field might contain fewer or more columns than expected, depending on the CSV content. It can be an empty array for empty rows. It's up to the user to handle these cases. The library does not trim whitespace from values, and it does not convert types.
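    For instance, a caller that expects a fixed number of columns can normalize each row itself. A minimal sketch (normalizeRow is a hypothetical helper, not part of cosovo):

```javascript
// Hypothetical helper (not part of cosovo): pad or truncate a parsed row
// to a fixed number of columns. Values stay as untrimmed strings, matching
// the library's behavior described above.
function normalizeRow(row, width) {
  if (row.length >= width) return row.slice(0, width)
  return row.concat(Array(width - row.length).fill(''))
}

console.log(normalizeRow(['X', 'Y'], 3))           // [ 'X', 'Y', '' ]
console.log(normalizeRow(['X', 'Y', 'Z', 'W'], 3)) // [ 'X', 'Y', 'Z' ]
console.log(normalizeRow([], 3))                   // [ '', '', '' ]
```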

    The errors field contains any parsing errors found in the row. It's an array of error messages, which can be useful for debugging.
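    As a sketch of how the errors field could be used, rows with and without errors can be separated after parsing. The splitting helper below is hypothetical; it only assumes the { row, errors } shape described above:

```javascript
// Hypothetical helper: separate clean rows from rows that carried parsing
// errors, given an array of { row, errors } results as yielded by the
// parse functions.
function splitResults(results) {
  const rows = []
  const problems = []
  for (const { row, errors } of results) {
    if (errors.length > 0) problems.push({ row, errors })
    else rows.push(row)
  }
  return { rows, problems }
}

const { rows, problems } = splitResults([
  { row: ['A', 'B'], errors: [] },
  { row: ['X'], errors: ['too few fields'] }
])
console.log(rows)     // [ [ 'A', 'B' ] ]
console.log(problems) // [ { row: [ 'X' ], errors: [ 'too few fields' ] } ]
```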

    The meta field provides the delimiter and newline strings, detected automatically or specified by the user. It also gives the number of characters in the line (as counted by JavaScript), the corresponding number of bytes in the original CSV file (which may differ due to multi-byte characters), and the byte offset of the line in the file. These counts include the newline characters.
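    The distinction between character and byte counts matters for non-ASCII content. A quick illustration, independent of cosovo:

```javascript
// A JavaScript string's length counts UTF-16 code units, while the CSV
// file on disk is typically UTF-8, so the two counts diverge for
// non-ASCII text.
const line = 'café,naïve\n'
console.log(line.length)                           // 11 characters
console.log(new TextEncoder().encode(line).length) // 13 bytes (é and ï take 2 bytes each)
```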

    The parseURL function accepts an optional second argument with options.

    It can contain options for fetching the CSV file, for guessing the delimiter and newline characters, and for parsing the CSV content.

    Find some examples of usage below. You can also find them in the examples directory, and run them with npm run examples.

    As the library uses async iterators, it's easy to stop parsing after a certain number of rows:

    import { parseURL } from 'cosovo'
    const url = 'https://data.source.coop/severo/csv-papaparse-test-files/verylong-sample.csv'
    const rows = []
    let count = 0
    for await (const { row } of parseURL(url)) {
      rows.push(row)
      count++
      if (count >= 10) {
        break
      }
    }
    console.log(rows)

    You can fetch only a specific byte range of the CSV file, to parse only a part of it. This is useful for large files.

    import { parseURL } from 'cosovo'
    const url = 'https://data.source.coop/severo/csv-papaparse-test-files/verylong-sample.csv'
    const fetchOptions = {
      firstByte: 30_000,
      lastByte: 30_200
    }
    const rows = []
    for await (const { row } of parseURL(url, { fetch: fetchOptions })) {
      rows.push(row)
    }
    console.log(rows)

    Use the result.meta.byteOffset and result.meta.byteCount fields to know the exact byte range of each parsed row, and adjust your fetching strategy accordingly. See the examples for an in-depth look.
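    For example, one simple strategy is to continue reading where a previous range stopped. The helper below is hypothetical; it only assumes the byteOffset and byteCount fields described above:

```javascript
// Hypothetical helper: the first byte after a parsed row, i.e. where the
// next ranged fetch could start. Assumes meta.byteOffset and meta.byteCount
// as documented above (counts include the newline).
function nextFirstByte(meta) {
  return meta.byteOffset + meta.byteCount
}

console.log(nextFirstByte({ byteOffset: 30_000, byteCount: 57 })) // 30057
```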

    You can also parse a CSV string directly with the parseText function:

    import { parseText } from 'cosovo'
    const csvString = 'A,B,C\nX,Y,Z'
    const rows = []
    for (const { row } of parseText(csvString)) {
      rows.push(row)
    }
    console.log(rows)

    Note that parseText provides a synchronous iterator, so you don't need to use await in the for loop.
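    To see why no await is needed, a plain generator behaves the same way. The generator below is only a stand-in for parseText, with a naive comma split that ignores quoting:

```javascript
// Stand-in for parseText (not the real implementation): a plain generator
// is a synchronous iterable, so a regular for...of loop consumes it. The
// comma split is naive (no quoting support) and only illustrates iteration.
function* fakeParseText(text) {
  for (const line of text.split('\n')) {
    yield { row: line.split(',') }
  }
}

const out = []
for (const { row } of fakeParseText('A,B,C\nX,Y,Z')) out.push(row)
console.log(out) // [ [ 'A', 'B', 'C' ], [ 'X', 'Y', 'Z' ] ]
```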

    This is an early version:

    • until 1.0.0, breaking changes will be introduced only in minor versions.
    • from version 1.0.0, breaking changes will be introduced only in major versions.

    This library is used by source.coop to preview CSV files. More info in csv-table, which fetches ranges of the remote CSV to display the rows that are visible in the table. It also caches the fetched ranges to avoid re-fetching them when scrolling.

    The code is heavily inspired by Papaparse.

    It has been partly funded by source.coop.