CSV Connector

A Comma Separated Values (CSV) file is a simple text file that stores data in a tabular format, with each line (row) representing a record and each field (column) within that record separated by commas. CSV files are commonly used to store data, such as assets and inventory, that can be easily exported and imported into other programs, such as spreadsheets, databases, and data visualization tools.

Brinqa offers a CSV Connector to facilitate the ingestion of data from a CSV file. If your data is stored in a proprietary format that the existing connectors cannot accommodate, you can export it to CSV files and then use the CSV Connector to ingest it into the Brinqa Platform for further processing. This integration empowers you to build a unified view of your attack surface and strengthen your cybersecurity posture.

This document details the information you must provide for the connector to access your file and retrieve data. See create a data integration for step-by-step instructions on setting up the integration.

Connection settings

When setting up a data integration, select CSV Connector from the Connector drop-down. If you cannot find the connector in the drop-down, make sure that you have installed it first. You must provide the following information:

  • Server: To establish a secure tunnel between the Brinqa Platform and your machine, where the CSV file resides, you may need to install a Brinqa Agent on the machine and create a data server for it. If this is required, select the data server that you have created.

  • Data directory: Enter the fully qualified path to the data directory where the CSV files reside. Wildcards (*) are allowed in the path, e.g., /feed/vendors/vendor_*.csv or /feed/mobile/mobile_device*.*.

    note

    When multiple CSV files are targeted using a wildcard, the CSV Connector processes them based on their last modified time, from oldest to newest. This ensures that the most recent data takes precedence when there are duplicates.

  • Target: Specify the unified data model (UDM) to map your data to. The default selection is Record.

    For example, if your CSV file contains inventory data, you can select Host or any other data model that extends Asset. If you keep the default, Brinqa Platform creates the source data model (SDM) but does not map it to any UDM.

  • Identifier fields: (Optional) Using a comma-separated list, specify the fields that can be used as identifiers. To ensure that there are no duplicates, it is crucial to declare the fields in the correct order.

  • Unique fields: (Optional) Enter a comma-separated list of fields that, when combined, identify a unique row. This is useful when your CSV file contains duplicates.

  • Numeric fields: (Optional) Enter a comma-separated list of fields that contain numbers. The format is field=type, where field is the name of the field and type is its numeric type, such as integer, double, or long.

  • Boolean fields: (Optional) Enter a comma-separated list of fields that contain boolean values. Example usage: Active, Internet Facing.

  • Date fields: (Optional) Enter a comma-separated list of fields that contain dates, or dates and times, in the Joda-Time format. The format is field=format, where field is the name of the field and format is the Joda-Time format used in the field. Refer to the Joda-Time DateTimeFormat documentation for a full list of supported formats and patterns.

    For example, if you have a field named First found and it uses the MM/dd/yyyy format, and another field named Last Updated and it uses the MM-dd-yyyy format, you should enter First found=MM/dd/yyyy,Last Updated=MM-dd-yyyy.

    Common Joda-Time Formats:

    • yyyy-MM-dd: Four-digit year, two-digit month, two-digit day (e.g., 2024-06-17).

    • yyyy-MM-dd HH:mm:ss: Full datetime with hours, minutes, and seconds (e.g., 2024-06-17 14:30:00).

    • MM/dd/yyyy: Two-digit month, two-digit day, four-digit year (e.g., 06/17/2024).

    • dd-MMM-yyyy: Two-digit day, short month name, four-digit year (e.g., 17-Jun-2024).

    • yyyyMMdd: Eight-digit date (e.g., 20240617).

    The CSV Connector also supports date fields in Epoch time formats, both in milliseconds and seconds:

    • For milliseconds, use the format: epochMillis.

    • For seconds, use the format: epochSeconds.

    Examples in milliseconds (epochMillis):

      • 1609459200000: Represents 01/01/2021 @ 12:00 AM (UTC).

      • 1609455600000: Represents 12/31/2020 @ 11:00 PM (UTC).

    Examples in seconds (epochSeconds):

      • 1609459200: Represents 01/01/2021 @ 12:00 AM (UTC).

      • 1609455600: Represents 12/31/2020 @ 11:00 PM (UTC).

  • Multi-value fields: (Optional) Enter a comma-separated list of fields containing multiple values. The format is field=delimiter where field is the name of the field and delimiter is the delimiter used to separate the values.

    For example, if you have a field named Ratings that contains "High: Medium: Low" as values, and another field named Compliance that contains "FedRAMP; HIPAA; PCI" as values, you should enter Ratings=:,Compliance=;.

  • Multi-row fields: (Optional) Enter a comma-separated list of fields that contain values spanning across multiple rows.

    In the following example, the IP address for host "h.brinqa.net" is specified across two rows:

    Hostname, IP address
    h.brinqa.net, 10.0.0.0
    h.brinqa.net, 192.168.1.1

    After the data has been processed in the Brinqa Platform, the hostname attribute displays "h.brinqa.net" and the ipAddress attribute contains ["10.0.0.0", "192.168.1.1"]. To prevent any duplicates in the ipAddress attribute, you must ensure that the IP address field contains unique values.

  • File encoding: Specify the encoding of the CSV file. The default encoding is UTF-8.

  • Text qualifier: Specify the qualifier that determines the start and end of a field. The default qualifier is the double quote (").

  • Field delimiter: Specify the delimiter for the fields. The default delimiter is comma (,).

  • EOL characters: Specify the characters that represent the end of the line (EOL). The default is CRLF.

  • Failure threshold: Specify the acceptable number of failed rows before processing of the CSV file is halted. A value of -1 indicates that an unlimited number of failures is acceptable. When the threshold is reached, the CSV Connector halts the ingestion process, allowing you to address any issues before resuming.

  • Max age: Specify the maximum number of days to keep the CSV file in the Brinqa Platform. Any value less than 0 indicates that the file does not expire, and 0 indicates not to keep the file. The default is -1.

  • Max files: Specify the maximum number of CSV files to keep in the data directory after they have been processed. Any value less than 0 indicates that there is no limit on the number of files to retain, and 0 indicates not to keep any file. The default is -1.

  • Rename or move the file after it's processed: Select this option to rename or move the file after it has been imported into the Brinqa Platform.

    tip

    If you enable this option, after a CSV file has been ingested, the connector renames the file by appending .processed to the file name. This ensures that the same file won't be ingested multiple times in subsequent sync operations.
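
The oldest-to-newest processing order described in the note under Data directory can be sketched in Python. This is only an illustration of the idea, not the connector's implementation: the helper names files_oldest_first and load_latest_rows, and the use of a single key column to detect duplicates, are assumptions made for this example.

```python
import csv
import glob
import os


def files_oldest_first(pattern):
    """Return the paths matching a wildcard pattern, sorted by last modified time."""
    return sorted(glob.glob(pattern), key=os.path.getmtime)


def load_latest_rows(pattern, key_field):
    """Read matching files oldest-first so rows from newer files replace older ones.

    key_field is a hypothetical single identifier column used to spot duplicates.
    """
    latest = {}
    for path in files_oldest_first(pattern):
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # A row read later (from a newer file) overwrites the earlier one.
                latest[row[key_field]] = row
    return latest
```

Because the newest file is read last, its rows are the ones that remain when the same key appears in more than one file, which is the effect the note describes.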

Types of data to retrieve

The CSV Connector retrieves records from CSV files and maps them to the data model you specify in the Target field.

info

The CSV Connector does not currently support operation options for the types of data it retrieves.

For detailed steps on how to view the data retrieved from the CSV Connector in the Brinqa Platform, see How to view your data.

Sample CSV file

Below is an example of a simple CSV file, compatible with the CSV Connector, that represents package information:

name,lastSeen,owners,vendor,versionName
Google Chrome,2024-01-15 08:00:00,John Doe;Jane Smith,Google,91.0.4472.124
Microsoft Office,2024-02-20 08:00:00,Alice Johnson;Bob Brown,Microsoft,16.0.13901.20462
Adobe Acrobat Reader,2024-03-05 08:00:00,Charlie Davis;Eve Adams,Adobe,2021.001.20145

Here is a table of the above CSV file information for simplified viewing:

| name | lastSeen | owners | vendor | versionName |
| --- | --- | --- | --- | --- |
| Google Chrome | 2024-01-15 08:00:00 | John Doe;Jane Smith | Google | 91.0.4472.124 |
| Microsoft Office | 2024-02-20 08:00:00 | Alice Johnson;Bob Brown | Microsoft | 16.0.13901.20462 |
| Adobe Acrobat Reader | 2024-03-05 08:00:00 | Charlie Davis;Eve Adams | Adobe | 2021.001.20145 |

In the above example, each row represents a record with the following attributes:

  • name: The name of the package.
  • lastSeen: The date and time when the package was last seen, in Joda-Time format yyyy-MM-dd HH:mm:ss.
  • owners: A semicolon-separated list of owners for the package.
  • vendor: The vendor of the package.
  • versionName: The version name of the package.

If your CSV files contain more complex data, ensure each field is properly formatted. If you use multi-value fields, you must specify the delimiters for these fields in the connector settings.
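
For this sample file, the matching connector settings would presumably be lastSeen=yyyy-MM-dd HH:mm:ss under Date fields and owners=; under Multi-value fields. The Python sketch below shows roughly what those two settings mean for each row. It is an illustration only, not how the connector is implemented: parse_packages is a hypothetical helper, and the strptime pattern is Python's equivalent of the Joda-Time pattern, not Joda-Time syntax.

```python
import csv
import io
from datetime import datetime

SAMPLE = """name,lastSeen,owners,vendor,versionName
Google Chrome,2024-01-15 08:00:00,John Doe;Jane Smith,Google,91.0.4472.124
Microsoft Office,2024-02-20 08:00:00,Alice Johnson;Bob Brown,Microsoft,16.0.13901.20462
Adobe Acrobat Reader,2024-03-05 08:00:00,Charlie Davis;Eve Adams,Adobe,2021.001.20145
"""


def parse_packages(text):
    """Parse the sample CSV, converting the date field and splitting the multi-value field."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        # Python pattern corresponding to the Joda-Time format yyyy-MM-dd HH:mm:ss.
        row["lastSeen"] = datetime.strptime(row["lastSeen"], "%Y-%m-%d %H:%M:%S")
        # owners is a multi-value field delimited by semicolons.
        row["owners"] = [owner.strip() for owner in row["owners"].split(";")]
        records.append(row)
    return records


records = parse_packages(SAMPLE)
print(records[0]["owners"])  # ['John Doe', 'Jane Smith']
```

Running a check like this against an export before configuring the integration is one way to confirm that the date pattern and delimiter you plan to enter actually match the file.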

APIs

Because the CSV Connector is file-based, it does not rely on any API endpoints and therefore does not offer any operation options.

Changelog

The CSV Connector has undergone the following changes:

3.0.9

  • Enhanced date time support by changing the data type of date attributes from Long to Instant.

3.0.8

  • Updated dependencies.

3.0.7

  • Updated dependencies.

3.0.6

  • Added support for Epoch seconds and milliseconds.

3.0.5

  • Fixed an issue where importing the same CSV file twice resulted in duplicated records.

3.0.4

  • Stored the value 'false' in a Boolean field if the retrieved value is anything other than 'true'.

3.0.3

  • Introduced configuration on a Boolean field to effectively handle boolean types.

3.0.2

  • Removed the use of ImmutableSet.

3.0.1