We are still missing a few datasets for Open-Question Answering, which is currently a very active field.
Namely, it would be really nice to add:
- WebQuestions (Berant et al., 2013) [done]
- CuratedTrec (Baudis et al., 2015) [not open-source]
- MS-MARCO (Nguyen et al., 2016) [done]
- SearchQA (Dunn et al., 2017) [done]
- FEVER (Thorne et al., 2018) [done]
All these datasets are cited in http://arxiv.org/abs/2005.11401