
🔍 What This Paper Introduces
Multiple imputation is a widely used, principled approach for handling missing values but often breaks down on very large or complex datasets. MIDAS (Multiple Imputation with Denoising Autoencoders) offers an accurate, fast, and scalable alternative by adapting a class of unsupervised neural networks—denoising autoencoders—to the imputation task.
🧠 How MIDAS Works
MIDAS repurposes denoising autoencoders by treating missing entries as an additional form of input corruption. The network is trained to reconstruct the observed values from these corrupted inputs, with reconstruction error measured only on the originally observed portion of the data. Imputations are then drawn from the trained model.
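To make the mechanism concrete, here is a minimal NumPy sketch of the idea, not the authors' implementation. It assumes a single tanh hidden layer, plain full-batch gradient descent, and zero-filling of corrupted inputs. The function names (`corrupt`, `masked_mse`, `train_autoencoder`, `impute`) and all hyperparameters are hypothetical choices for illustration. Generating multiple draws by re-applying random corruption at prediction time is also an assumption of this sketch.

```python
import numpy as np

def corrupt(x_filled, observed, drop_rate, rng):
    """Zero out missing entries plus a random subset of observed ones."""
    keep = rng.random(x_filled.shape) >= drop_rate
    input_mask = observed & keep            # entries the network is allowed to see
    return np.where(input_mask, x_filled, 0.0), input_mask

def masked_mse(x, x_hat, observed):
    """Mean squared reconstruction error over originally observed entries only."""
    diff = np.where(observed, x_hat - x, 0.0)
    return (diff ** 2).sum() / observed.sum()

def forward(x_in, params):
    """One-hidden-layer autoencoder: tanh encoder, linear decoder."""
    w1, b1, w2, b2 = params
    h = np.tanh(x_in @ w1 + b1)
    return h, h @ w2 + b2

def train_autoencoder(x, observed, hidden=8, drop_rate=0.5,
                      lr=0.05, epochs=500, seed=0):
    """Fit the autoencoder by gradient descent on the masked loss (illustrative only)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    w1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    x_filled = np.where(observed, x, 0.0)   # placeholder value for missing cells
    n_obs = observed.sum()
    for _ in range(epochs):
        x_in, _ = corrupt(x_filled, observed, drop_rate, rng)
        h, x_hat = forward(x_in, (w1, b1, w2, b2))
        # Gradient of the masked MSE; missing cells contribute nothing.
        g_out = 2.0 * np.where(observed, x_hat - x_filled, 0.0) / n_obs
        g_hid = (g_out @ w2.T) * (1.0 - h ** 2)
        w2 -= lr * (h.T @ g_out);    b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x_in.T @ g_hid); b1 -= lr * g_hid.sum(axis=0)
    return w1, b1, w2, b2

def impute(x, observed, params, m=5, drop_rate=0.5, seed=1):
    """Return m completed datasets; observed values are always kept as-is."""
    rng = np.random.default_rng(seed)
    x_filled = np.where(observed, x, 0.0)
    draws = []
    for _ in range(m):
        x_in, _ = corrupt(x_filled, observed, drop_rate, rng)
        _, x_hat = forward(x_in, params)
        draws.append(np.where(observed, x_filled, x_hat))
    return draws
```

A typical call under these assumptions would standardize the columns, mark missing cells with `NaN`, build `observed = ~np.isnan(x)`, then run `params = train_autoencoder(x, observed)` followed by `impute(x, observed, params, m=5)`. Each returned array differs only in the originally missing cells.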
📋 Key Features and Procedure
📈 Tests on Simulated and Real Social Science Data
Systematic evaluations include both simulations and empirical social science datasets. An applied example uses a large-scale electoral survey to demonstrate performance in a real-world setting.
⚙️ Practical Takeaways and Tools
🔎 Why It Matters
MIDAS bridges principled multiple imputation and modern deep learning, offering political scientists and social researchers a scalable tool to handle missing data in large surveys and complex datasets without sacrificing accuracy.

| The MIDAS Touch: Accurate and Scalable Missing-Data Imputation With Deep Learning was authored by Ranjit Lall and Thomas Robinson. It was published by Cambridge University Press in Political Analysis in 2022. |
