UTF-8 CSV files with BOM aren't parsed correctly if the first header field contains quotes #37

@datatraveller1

I have a CSV file encoded as UTF-8 with a BOM:

"first_column","second_column"
"Hello","how are you"

This is a valid CSV file, but parsing it produces this error:

Record #0 has error: bare " in non-quoted-field

The issue occurs with a UTF-8 file that has a BOM when the first header field is enclosed in quotes.
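To illustrate what the parser is probably seeing, here is a small Python sketch (Python is only used for illustration and is not necessarily the language of this tool). It rebuilds the sample file from this report in memory and shows that, after plain UTF-8 decoding, the first character of the header line is the BOM (U+FEFF) and the opening quote comes second. The assumption is that the parser therefore treats the field as unquoted and then trips over the quote.

```python
import codecs

# The sample file from this report, written as UTF-8 with a BOM (EF BB BF).
data = codecs.BOM_UTF8 + b'"first_column","second_column"\n"Hello","how are you"\n'

# Decoding as plain UTF-8 keeps the BOM as the character U+FEFF.
first_line = data.decode("utf-8").splitlines()[0]

print(repr(first_line[0]))  # the first character is the BOM, not the quote
print(first_line[1])        # the opening quote is only the second character
```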

Suggestion: this could be solved by stripping the UTF-8 BOM from the start of the first (header) line before parsing.
Pseudocode:
if (line_number == 1) { sub(/^\xef\xbb\xbf/, "", line) }
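As a runnable version of the same idea, here is a minimal Python sketch. It is an illustration only, not a patch for this project: `strip_utf8_bom` is a hypothetical helper name, and the sample data is the file from this report built in memory.

```python
import codecs
import csv
import io


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# The sample file from this report, encoded as UTF-8 with a BOM.
sample = codecs.BOM_UTF8 + b'"first_column","second_column"\n"Hello","how are you"\n'

# Strip the BOM once at the start of the input, then parse as usual.
text = strip_utf8_bom(sample).decode("utf-8")
rows = list(csv.reader(io.StringIO(text)))
print(rows)
```

In Python specifically, decoding with the built-in `utf-8-sig` codec has the same effect, since it drops a leading BOM if there is one. Input without a BOM passes through unchanged either way.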
