Data science, machine learning, and deep learning techniques have driven tremendous advances on a wide variety of challenging problems over the last decade. However, these advances are overwhelmingly clustered in domains where the needed training datasets can be readily curated (e.g., natural language processing, computer vision, recommendation systems, and translation). In the sciences and engineering, adoption of these transformative techniques has been bottlenecked by the expense of curating training data. This is especially true in Materials Informatics, where the number of different possibilities -- combinations of material chemistries, microstructures, and processing pathways -- is overwhelming and the cost of collecting data is tremendous.
We present MICRO2D to support the continued growth of big data efforts in Materials Informatics. MICRO2D is a statistically diverse, big heterogeneous microstructure dataset containing
Additional localized fields are available upon request, but are not included due to memory constraints. Most importantly, the dataset contains microstructures displaying an extremely large diversity of 2-point statistics and local neighborhood distributions. As a result, MICRO2D provides a valuable environment for studying the connection between microstructure features and important properties and behaviors. The dataset is hosted permanently on the following Google Drive (the link will be fixed when the paper is published).
MICRO2D is provided for use under a CC-BY 4.0 License. If you use this dataset in your research, please cite us using the following reference:
Robertson, A.E., Generale, A.P., Kelly, C., Buzzy, M.O., Kalidindi, S.R. MICRO2D: A Large, Statistically Diverse, Heterogeneous Microstructure Dataset. Integrating Materials and Manufacturing Innovation, (2024).
Additionally, please consider reading our other papers on microstructure generation.
Buzzy, M.O., Robertson, A.E., Kalidindi, S.R. Statistically Conditioned Polycrystal Generation Using Denoising Diffusion Models. In Review: Acta Materialia, (2024).
Robertson, A.E., Kelly, C., Buzzy, M.O., Kalidindi, S.R. Local-Global Decompositions for Conditional Microstructure Generation. Acta Materialia, 253, 118966 (2023).
Robertson, A.E. and Kalidindi, S.R. Efficient Generation of Anisotropic N-Field Microstructures from 2-Point Statistics using Multi-Output Gaussian Random Fields. Acta Materialia, 232, 117927 (2022).
The MICRO2D dataset is composed of 87,379 2-phase microstructures. We refer the interested reader to the first paper listed above -- "MICRO2D: A Large, Statistically Diverse, Heterogeneous Microstructure Dataset" -- for a thorough analysis of the dataset's content. The contained microstructures display both local and global diversity. Locally, the microstructures are separated into 10 classes, each defined by a unique set of local features. Globally, the microstructures' individual features are arranged in a wide diversity of spatial patterns. Mathematically, this corresponds to diversity in their 2-point statistics. The website's banner image contains example microstructures with different local and global characteristics. Additionally, each microstructure in the dataset comes with several computed properties. Primarily, these are homogenized mechanical (elastic: Ex, Ey, vxy, vyx, Gxy) and thermal (kx, ky) values. The values are reported for 6 different combinations of constituent properties for the two phases. This leaves a total of 42 homogenized properties per microstructure (7 properties for each of the 6 combinations). The dataset also comes with spatial elastic strain fields for the high contrast case (1 GPa:1000 GPa). Strain fields are included for two boundary conditions: uniaxial tension (X) and pure shear. Strain fields for the remaining constituent combinations are available from the authors.
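To make the idea of 2-point statistics concrete, the following is a minimal sketch of one common way to compute the autocorrelation of a 2-phase microstructure using FFTs. It assumes a binary array in which the phase of interest is labelled 1 and assumes periodic boundaries; the function name is illustrative, and this is not necessarily how the statistics reported for MICRO2D were computed.

```python
import numpy as np


def two_point_autocorrelation(micro):
    """Periodic 2-point autocorrelation of the phase labelled 1.

    Assumption: `micro` is a 2D binary array (0/1) and the microstructure
    is treated as periodic. The zero-shift vector is moved to the array
    centre, where the value equals the volume fraction of phase 1.
    """
    m = np.asarray(micro, dtype=float)
    spectrum = np.fft.fft2(m)
    # Correlation theorem: autocorrelation = IFFT(F * conj(F)), normalized
    # by the number of pixels so that each entry is a probability.
    corr = np.fft.ifft2(spectrum * np.conj(spectrum)).real / m.size
    return np.fft.fftshift(corr)
```

Each entry of the returned array estimates the probability that two points separated by the corresponding shift vector both fall in phase 1, so two microstructures with the same volume fraction but different spatial arrangements produce different maps.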
The dataset is stored in the HDF5 file format. We recommend accessing the files using the Python package h5py. Once the file is opened, the dataset's microstructures are found in 10 groups corresponding to their classes. The available classes are:
The microstructures, properties, and important metadata for each class are contained within each group. The following schematic exemplifies the structure of the group for the 'GRF' class and the homogenized properties.
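For readers who want to inspect the layout programmatically, the following is a minimal sketch using h5py. The helper names are illustrative, and no dataset names inside the groups are assumed; the first helper simply lists whatever groups and datasets the file contains.

```python
import h5py


def summarize_hdf5(path):
    """Return {object path: description} for every group and dataset in the file."""
    summary = {}

    def visit(name, obj):
        # Record the shape and dtype of datasets; mark everything else as a group.
        if isinstance(obj, h5py.Dataset):
            summary[name] = f"dataset shape={obj.shape} dtype={obj.dtype}"
        else:
            summary[name] = "group"

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary


def load_entry(path, dataset_path, index=0):
    """Read one entry along the first axis of a dataset without loading the rest."""
    with h5py.File(path, "r") as f:
        return f[dataset_path][index]
```

For example, `summarize_hdf5("MICRO2D.h5")` (a hypothetical file name; use the path of the downloaded file) returns a dictionary whose keys include the class groups such as 'GRF'. Any dataset path printed there can then be passed to `load_entry` to pull out a single microstructure or property record.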
The MICRO2D dataset is released under the CC-BY 4.0 license. By downloading and utilizing the provided datasets, the user agrees to the contents of this license. The user is free to copy, redistribute, and transmit the dataset without obtaining specific permission from the original authors provided that proper attribution is given to the MICRO2D dataset source. For guidance on proper attribution, please see the section on citation above.