We present a method for hyperspectral pixel unmixing. The method assumes that (1) abundances can be encoded as Dirichlet distributions and (2) the spectra of endmembers can be represented as multivariate normal distributions. It addresses abundance estimation and endmember extraction within a variational autoencoder setting, where a Dirichlet bottleneck layer models the abundances and the decoder performs endmember extraction. The method can also leverage the transfer learning paradigm: the model is trained only on synthetic data containing pixels that are linear combinations of one or more endmembers of interest. In this case, we retrieve the endmembers (spectra) from the United States Geological Survey (USGS) Spectral Library. The trained model can then be used to perform pixel unmixing on “real data” that contains a subset of the endmembers used to generate the synthetic data. The model achieves state-of-the-art results on several benchmarks: Cuprite, Urban HYDICE, and Samson. We also present a new synthetic dataset, OnTech-HSI-Syn-21, that can be used to study hyperspectral pixel unmixing methods. We showcase the transfer learning capabilities of the proposed model on the Cuprite and OnTech-HSI-Syn-21 datasets. In summary, the proposed method can be applied to pixel unmixing in a variety of domains, including agriculture, forestry, mineralogy, analysis of materials, and healthcare. In addition, it avoids the need for labeled data for training by leveraging the transfer learning paradigm, where the model is trained on synthetic data generated using the endmembers present in the “real” data.
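The two modeling assumptions can be written compactly under the standard linear mixing model. The notation here ($K$ endmembers, abundance vector $\mathbf{a}$, endmember spectra $\mathbf{e}_k$, pixel spectrum $\mathbf{x}$, and the parameters $\boldsymbol{\alpha}$, $\boldsymbol{\mu}_k$, $\boldsymbol{\Sigma}_k$) is introduced for illustration and may differ from the notation used in the publications.

```latex
\begin{align}
\mathbf{a} &\sim \mathrm{Dir}(\boldsymbol{\alpha}),
  \qquad a_k \ge 0,\quad \sum_{k=1}^{K} a_k = 1, \\
\mathbf{e}_k &\sim \mathcal{N}(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
  \qquad k = 1, \dots, K, \\
\mathbf{x} &= \sum_{k=1}^{K} a_k \, \mathbf{e}_k .
\end{align}
```

Read this way, the Dirichlet bottleneck plays the role of $\mathbf{a}$, and the decoder holds the endmember spectra that reconstruct $\mathbf{x}$ from the abundances.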
The Cuprite hyperspectral image (HSI) dataset covers a region around Las Vegas, Nevada, USA, and comprises a 512×614 HSI with 188 channels. The area under observation contains 12 minerals (or, for our purposes, endmembers): Alunite, Andradite, Buddingtonite, Dumortierite, Kaolinite1, Kaolinite2, Muscovite, Montmorillonite, Nontronite, Pyrope, Sphene, and Chalcedony. The Cuprite dataset lacks the pixel-level abundance information that the proposed model needs for training, so we explored transfer learning to deal with this issue. We constructed a Cuprite Synthetic dataset that uses the same materials as those found in the Cuprite dataset, with the spectra for these materials taken from the USGS Spectral Library. The model was trained on the Cuprite Synthetic dataset only, and the trained model was subsequently used to analyze the original Cuprite dataset.
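The construction of such a synthetic dataset can be sketched as follows. This is a minimal illustration, not the procedure used to build Cuprite Synthetic: the function `make_synthetic_pixels`, the parameters `max_active` and `noise_std`, and the random placeholder spectra (standing in for USGS library spectra) are all assumptions made for the example.

```python
import numpy as np

# Endmember names as listed for Cuprite; the band count matches the 188-channel HSI.
ENDMEMBERS = [
    "Alunite", "Andradite", "Buddingtonite", "Dumortierite",
    "Kaolinite1", "Kaolinite2", "Muscovite", "Montmorillonite",
    "Nontronite", "Pyrope", "Sphene", "Chalcedony",
]
NUM_BANDS = 188


def make_synthetic_pixels(spectra, num_pixels, max_active=3, noise_std=0.01, seed=0):
    """Mix endmember spectra linearly using Dirichlet-sampled abundances.

    spectra: array of shape (num_endmembers, num_bands).
    Returns (pixels, abundances) with shapes
    (num_pixels, num_bands) and (num_pixels, num_endmembers).
    max_active and noise_std are hypothetical knobs, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    num_endmembers = spectra.shape[0]
    abundances = np.zeros((num_pixels, num_endmembers))
    for i in range(num_pixels):
        # Choose how many endmembers are present in this pixel (one or more).
        k = int(rng.integers(1, max_active + 1))
        active = rng.choice(num_endmembers, size=k, replace=False)
        # Dirichlet samples are non-negative and sum to one.
        abundances[i, active] = rng.dirichlet(np.ones(k))
    # Linear mixing: each pixel is an abundance-weighted sum of spectra.
    pixels = abundances @ spectra
    # Additive Gaussian noise (an assumption of this sketch).
    pixels += rng.normal(0.0, noise_std, size=pixels.shape)
    return pixels, abundances


# Placeholder spectra: random values stand in for USGS library reflectances.
placeholder_spectra = np.random.default_rng(1).uniform(
    0.0, 1.0, size=(len(ENDMEMBERS), NUM_BANDS)
)
pixels, abundances = make_synthetic_pixels(placeholder_spectra, num_pixels=1000)
```

Because the abundances are known by construction, they can serve as the pixel-level labels that the real Cuprite image does not provide.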
The datasets used in this study are OnTech-HSI-Syn-21, Cuprite (and Cuprite Synthetic), Samson, and Urban HYDICE. Cuprite, Samson, and Urban HYDICE are standard benchmarks that can be obtained from their respective sources. We provide OnTech-HSI-Syn-21 and Cuprite Synthetic below.
For technical details, please see the following publications: