The Dataset page lets you import, and later replace, the datasets you want to use in your project. You can access the Dataset page from the top navigation bar.


📘

Data Format

Watchful only supports CSV data whose column names start with a capital letter and contain only alphanumeric characters and underscores.

  • Files must be in .csv format with a header row
  • Column names must start with a capital letter
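
The column rules above can be checked locally before importing a file. The following is a minimal sketch and not part of Watchful: the helper name `invalid_columns` and the regular expression are assumptions derived from the rules on this page, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import csv
import io
import re

# Assumed pattern: a capital letter first, then letters, digits, or underscores.
COLUMN_NAME = re.compile(r"[A-Z][A-Za-z0-9_]*")


def invalid_columns(csv_text):
    """Return the header names that break the column-name rules."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file has no header row
    return [name for name in header if not COLUMN_NAME.fullmatch(name)]


# Hypothetical sample data: two of the four column names are invalid.
sample = "Name,Episode_Count,role,First Seen\nFlower,13,matriarch,2005\n"
print(invalid_columns(sample))  # ['role', 'First Seen']
```

An empty result means every column name satisfies the assumed pattern.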

Dataset Import Choices

Watchful currently supports importing .csv files either from local disk or via our S3 integration.

Local CSV File

Once the file source is selected, Watchful will show you a preview of what your dataset will look like when loaded into the project. If the data looks as expected, you can confirm importing the dataset by clicking the "Import" button on the top right.


S3 Import

Watchful supports integration with any S3 API-compatible storage via Watchful Hub (see Setting up Watchful Hub). To import a CSV file from an existing bucket, you will need:

  • Watchful Hub set up with appropriate AWS credentials.
  • To be logged in to the Application (see Watchful Hub Overview and User Roles).
  • The name of the bucket and the path of the file.

📘

For the S3 URI s3://meerkat-manor/season-1/cast.csv:

  • The S3 Bucket would be meerkat-manor
  • The S3 Path would be season-1/cast.csv
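
If you start from a full S3 URI, the split described above can be done with the Python standard library. This is a minimal sketch: `split_s3_uri` is a hypothetical helper, not a Watchful API, and it only separates the two values you would enter in the import form.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an S3 URI into (bucket, path) as entered in the import form."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the host part; the path is everything after the first slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, path = split_s3_uri("s3://meerkat-manor/season-1/cast.csv")
print(bucket)  # meerkat-manor
print(path)    # season-1/cast.csv
```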

To begin, select Import from S3, then enter the S3 Bucket and S3 Path in the Add S3 file to project fields.


Replacing a Dataset

You can replace the existing dataset in this project by selecting a new dataset to import. Replacing a dataset permanently removes the current file from your project, but existing Hinters and Hand Labels remain. Each time you replace a dataset, you will need to relabel for base rate via the Hand Label tab so that Watchful gets an accurate representation of the classes in the dataset. For more on how hand labels are used within Watchful, see Hand Labeling.


After replacing a dataset, you will need to:

  • Recalibrate your base rates by Hand Labeling.
  • Review your hinters, since some may need to be edited if they rely on specific attributes of the dataset that was replaced.

Importing Hand Labels

Watchful supports importing hand labels from a column named exactly HandLabels, with values in the specific format Watchful expects:

  • "Class1-Y" or "Class1-N" for a single class label.
  • "Class1-Y Class2-N Class3-Y" for multi-class labels, with each class separated by a space.

❗️

Class names must follow this format:

  • Must start with an uppercase letter.
  • Can only contain alphanumeric characters or '_'.
  • Cannot exceed 32 characters.

In addition, every "Y" and "N" must be capitalized.
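
The format rules above can be expressed as a small validator to run on HandLabels values before import. This is a sketch under stated assumptions, not Watchful's own parser: `parse_hand_labels` is a hypothetical helper, the pattern is inferred from the rules on this page, and an empty cell is assumed to mean the row has no hand labels.

```python
import re

# Assumed from the rules above: uppercase first letter, then letters,
# digits, or underscores, at most 32 characters in total.
CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9_]{0,31}")


def parse_hand_labels(cell):
    """Parse 'Class1-Y Class2-N' into {'Class1': True, 'Class2': False}."""
    labels = {}
    for token in cell.split():
        # Split on the last hyphen: class name on the left, Y/N on the right.
        name, sep, flag = token.rpartition("-")
        if not sep or flag not in ("Y", "N") or not CLASS_NAME.fullmatch(name):
            raise ValueError(f"badly formatted hand label: {token!r}")
        labels[name] = flag == "Y"
    return labels


print(parse_hand_labels("Class1-Y Class2-N Class3-Y"))
# {'Class1': True, 'Class2': False, 'Class3': True}
```

A lowercase flag such as "Class1-y" or a class name starting with a lowercase letter raises a ValueError in this sketch.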

This is also the format that we use to export hand labels as part of the Exporting process.

When Watchful sees the HandLabels column in your dataset, it will automatically create hand labels from all correctly formatted, hand-labeled candidates. If your dataset contains classes that Watchful does not already have, it will create them. To make sure your hand labels are imported successfully, you can:

  1. If your dataset contains hand labels for classes Watchful hasn't seen before, check to see that Watchful has created those classes.
  2. Use the handlabel <pos/neg> <Class-Name> query to see all hand-labeled candidates, and verify how many there are (see the Querying section for more details).
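
To know what numbers to expect from the query in step 2, you can count the hand labels in your file before importing it. The following is a rough sketch, not a Watchful feature: `count_hand_labels` is a hypothetical helper, the sample data is made up, and it assumes that "Y" labels correspond to pos and "N" labels to neg in the query.

```python
import csv
import io
from collections import Counter


def count_hand_labels(csv_text):
    """Count (class, 'Y'/'N') pairs found in the HandLabels column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows with an empty or missing HandLabels cell contribute nothing.
        for token in (row.get("HandLabels") or "").split():
            name, _, flag = token.rpartition("-")
            counts[(name, flag)] += 1
    return counts


sample = (
    "Text,HandLabels\n"
    "first row,Spam-Y\n"
    "second row,Spam-N Urgent-Y\n"
    "third row,\n"
)
for (name, flag), n in sorted(count_hand_labels(sample).items()):
    print(name, flag, n)
```

For a real file, pass the file's contents as `csv_text` and compare the counts per class with the number of candidates the query returns.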

Limitations

  • File size is limited by the amount of RAM available on the machine.
  • No matter how large the file, Watchful does a streamed import which allows you to start working immediately while the rest of the file is imported.

What's Next

Read more about using hand labels, queries, and hinters
