CSV Connector

A Comma Separated Values (CSV) file is a simple text file that stores data in a tabular format, with each line (row) representing a record and each field (column) within that record separated by commas. CSV files are commonly used to store data, such as assets and inventory, that can be easily exported and imported into other programs, such as spreadsheets, databases, and data visualization tools.

Brinqa offers a CSV Connector to facilitate the ingestion of data from a CSV file. If your data is stored in a proprietary format that no existing connector supports, you can export it to CSV files and then use the CSV Connector to ingest it into the Brinqa Platform for further processing. This integration helps you build a unified view of your attack surface and strengthen your cybersecurity posture.

This document details the information you must provide for the connector to access your file and retrieve data. See create a data integration for step-by-step instructions on setting up the integration.

Connection settings

When setting up a data integration, select CSV Connector from the Connector drop-down. If you cannot find the connector in the drop-down, make sure that you have installed it first. You must provide the following information:

  • Server: To establish a secure tunnel between the Brinqa Platform and your machine, where the CSV file resides, you may need to install a Brinqa Agent on the machine and create a data server for it. If this is required, select the data server that you have created.

  • Data directory: Enter the fully qualified path to the data directory where the CSV files reside. Wildcards (*) are allowed in the path, e.g., /feed/vendors/vendor_*.csv or /feed/mobile/mobile_device*.*.

    Note: When multiple CSV files are targeted using a wildcard, the CSV Connector processes them based on their last modified time, from oldest to newest. This ensures that the most recent data takes precedence when there are duplicates.

  • Target: Specify the unified data model (UDM) to map your data to. The default selection is Record.

    For example, if your CSV file contains inventory data, you can select Host or any other data model that extends Asset. If you keep the default, Brinqa Platform creates the source data model (SDM) but does not map it to any UDM.

  • Identifier fields: (Optional) Using a comma-separated list, specify the fields that can be used as identifiers. To ensure that there are no duplicates, it is crucial to declare the fields in the correct order.

  • Unique fields: (Optional) Enter a comma-separated list of fields that, when combined, represent a unique row. This is useful when your CSV file contains duplicates.

  • Numeric fields: (Optional) Enter a comma-separated list of fields that contain numbers. The format is field=type where field is the name of the field and type is its numeric type, such as integer, double, or long.

  • Boolean fields: (Optional) Enter a comma-separated list of fields that contain boolean values. Example usage: Active, Internet Facing.

  • Date fields: (Optional) Enter a comma-separated list of fields that contain dates. The format is field=date format where field is the name of the field and date format is the format used in the date field.

    For example, if a field named First found uses the MM/dd/yyyy format and another field named Last Updated uses the MM-dd-yyyy format, enter First found=MM/dd/yyyy,Last Updated=MM-dd-yyyy.

    The CSV Connector also supports date fields in Epoch time formats, both in milliseconds and seconds.

    • For milliseconds, use the format: epochMillis.

    • For seconds, use the format: epochSeconds.

    Here are some examples:

    Milliseconds:

    • epochMillis: 1609459200000: Represents 01/01/2021 @ 12:00 AM (UTC) in milliseconds.

    • epochMillis: 1609455600000: Represents 12/31/2020 @ 11:00 PM (UTC) in milliseconds.

    Seconds:

    • epochSeconds: 1609459200: Represents 01/01/2021 @ 12:00 AM (UTC) in seconds.

    • epochSeconds: 1609455600: Represents 12/31/2020 @ 11:00 PM (UTC) in seconds.

  • Multi-value fields: (Optional) Enter a comma-separated list of fields containing multiple values. The format is field=delimiter where field is the name of the field and delimiter is the delimiter used to separate the values.

    For example, if a field named Ratings contains "High: Medium: Low" as values and another field named Compliance contains "FedRAMP; HIPAA; PCI" as values, enter Ratings=:,Compliance=;.

  • Multi-row fields: (Optional) Enter a comma-separated list of fields that contain values spanning across multiple rows.

    In the following example, the IP addresses for host "h.brinqa.net" are specified across two rows:

    Hostname, IP address
    h.brinqa.net, 10.0.0.0
    h.brinqa.net, 192.168.1.1

    After the data has been processed in the Brinqa Platform, the hostname attribute displays "h.brinqa.net" and the ipAddress attribute contains ["10.0.0.0", "192.168.1.1"]. To prevent any duplicates in the ipAddress attribute, you must ensure that the IP address field contains unique values.

  • File encoding: Specify the encoding of the CSV file. The default encoding is UTF-8.

  • Text qualifier: Specify the qualifier that determines the start and end of a field. The default qualifier is the double quote (").

  • Field delimiter: Specify the delimiter for the fields. The default delimiter is comma (,).

  • EOL characters: Specify the characters that represent the end of the line (EOL). The default is CRLF.

  • Failure threshold: Specify the acceptable number of failed rows before processing of the CSV file is halted. A value of -1 indicates that any number of failures is acceptable. When the threshold is reached, the CSV Connector halts the ingestion process, allowing you to address any issues before resuming.

  • Max age: Specify the maximum number of days to keep the CSV file in the Brinqa Platform. Any value less than 0 indicates that the file does not expire, and 0 indicates that the file is not kept. The default is -1.

  • Max files: Specify the maximum number of CSV files to keep in the data directory after they have been processed. Any value less than 0 indicates that there is no limit on the number of files to retain, and 0 indicates that no files are kept. The default is -1.

  • Rename or move the file after it's processed: Select this option to rename or move the file after it has been imported into the Brinqa Platform.

    Tip: If you enable this option, after a CSV file has been ingested, the connector renames the file by appending .processed to the file name. This ensures that the same file won't be ingested multiple times in subsequent sync operations.
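
The file handling described under Data directory and under the rename option can be illustrated with a minimal Python sketch. This is not the connector's implementation: the function names pending_files and mark_processed are hypothetical, and skipping files that already end in .processed is an assumption made so that a broad wildcard does not pick them up again.

```python
import glob
import os


def pending_files(pattern):
    """Return files matching a wildcard pattern, oldest modification time first."""
    # Assumption: files already renamed with a .processed suffix are ignored.
    matches = [p for p in glob.glob(pattern) if not p.endswith(".processed")]
    # Oldest first, so rows from newer files are applied last and take precedence.
    return sorted(matches, key=os.path.getmtime)


def mark_processed(path):
    """Rename a file by appending .processed to its name and return the new path."""
    target = path + ".processed"
    os.rename(path, target)
    return target
```

With a pattern such as /feed/vendors/vendor_*.csv, iterating over pending_files(pattern) and calling mark_processed on each path after ingestion would keep a later sync from reading the same file again.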
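
To make the field=value settings for Date fields and Multi-value fields more concrete, the following Python sketch parses such a setting and applies it to raw values. It is an illustration only, not the connector's code: parse_setting, to_datetime, and split_multi are hypothetical names, and the translation of date patterns covers only the yyyy, MM, and dd tokens used in the examples above.

```python
from datetime import datetime, timezone


def parse_setting(setting):
    """Split 'field=value,field=value' into a dict of field name to value."""
    pairs = (item.split("=", 1) for item in setting.split(",") if item.strip())
    return {name.strip(): value for name, value in pairs}


# Assumption: only these three pattern tokens are translated to strptime directives.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def to_datetime(raw, fmt):
    """Convert a raw CSV value to a datetime using a configured date format."""
    if fmt == "epochMillis":
        return datetime.fromtimestamp(int(raw) / 1000, tz=timezone.utc)
    if fmt == "epochSeconds":
        return datetime.fromtimestamp(int(raw), tz=timezone.utc)
    for token, directive in _TOKENS.items():
        fmt = fmt.replace(token, directive)
    return datetime.strptime(raw, fmt)


def split_multi(raw, delimiter):
    """Split a multi-value field on its delimiter and trim surrounding spaces."""
    return [part.strip() for part in raw.split(delimiter) if part.strip()]
```

For example, parse_setting("Ratings=:,Compliance=;") yields the delimiter for each field, and split_multi("FedRAMP; HIPAA; PCI", ";") then produces the three separate values. Because the setting itself is comma-separated, this sketch cannot express a comma as a multi-value delimiter.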
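
The Multi-row fields behavior can be sketched in the same way. The hypothetical merge_multi_row function below groups rows by a key field and collects the values of one multi-row field into a list. It is a simplified illustration under the assumption that values are collected in file order without deduplication, which is why repeated values in the input would appear more than once in the result.

```python
import csv
import io


def merge_multi_row(text, key_field, multi_row_field):
    """Collect the values of multi_row_field for each distinct key_field value."""
    # skipinitialspace drops the space that follows each comma in the sample data.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    merged = {}
    for row in reader:
        merged.setdefault(row[key_field], []).append(row[multi_row_field])
    return merged


sample = (
    "Hostname, IP address\n"
    "h.brinqa.net, 10.0.0.0\n"
    "h.brinqa.net, 192.168.1.1\n"
)
hosts = merge_multi_row(sample, "Hostname", "IP address")
```

Here hosts maps "h.brinqa.net" to a list holding both IP addresses, mirroring the ipAddress attribute in the example above.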

Types of data to retrieve

The CSV Connector retrieves records from CSV files and maps them to the data model you specify in the Target field.

Info: The CSV Connector does not currently support operation options for the types of data it retrieves.

For detailed steps on how to view the data retrieved from the CSV Connector in the Brinqa Platform, see How to view your data.

APIs

Because the CSV Connector is file-based, it doesn't rely on any API endpoints and therefore doesn't offer any operation options.

Changelog

The CSV Connector has undergone the following changes:

3.0.8

  • Updated dependencies.

3.0.6

  • Added support for Epoch seconds and milliseconds.

3.0.5

  • Fixed an issue where importing the same CSV file twice resulted in duplicated records.

3.0.4

  • Stored the value 'false' in a Boolean field if the retrieved value is anything other than 'true'.

3.0.3

  • Introduced configuration on a Boolean field to effectively handle boolean types.

3.0.2

  • Removed the use of ImmutableSet.

3.0.1