Intrusion detection dataset
A New Hope for DARPA OpTC:
Correction and labeling of the dataset DARPA OpTC
PIRAT\'); Team, 2025

Presentation

The DARPA OpTC dataset is an interesting dataset for designing and evaluating intrusion detection systems (IDS), especially IDS based on machine learning. However, we discovered mismatches affecting unique identifiers in the dataset. Moreover, the dataset has no precise official labeling.
In this work, we fix the errors that we discovered in the dataset and provide a labeling of the dataset at the host and network levels.
All the scripts are available at https://gitlab.inria.fr/pirat/a_new_hope_for_darpa_optc
The corrected dataset will be made available as soon as we find a way to share 900 GB.
Reference:
The paper associated with the dataset is: Majorczyk, F., Pilastre, B., Dijoud, F. (2025). A New Hope for DARPA OpTC. In: ACSAC 2025 International Workshops. https://doi.org/XXXXXX

Contact

If you have questions about this work, do not hesitate to send an email to frederic[.]majorczyk[\at]intradef[.]gouv[.]fr