H-MOG Data Set: A Multimodal Data Set for Evaluating Continuous Authentication Performance in Smartphones

Qing Yang, Ge Peng, David T. Nguyen, Xin Qi, Gang Zhou (College of William and Mary)
Zdeňka Sitová (New York Institute of Technology; Masaryk University)
Paolo Gasti, Kiran S. Balagani (New York Institute of Technology)

1. Introduction

We performed a large-scale user study to collect a wide range of signals about user behavior on smartphones, including touch, gesture, and pausality of the user, as well as movement and orientation of the phone. This dataset has been used to evaluate H-MOG, a continuous authentication modality for smartphones. A detailed description of the dataset and its application is given in our poster paper (PDF) at ACM SenSys '14. The H-MOG paper that uses this dataset is published in IEEE Transactions on Information Forensics and Security (link on IEEE Xplore).

2. Data Collection Tool

We developed a data collection tool for Android phones that records real-time touch, sensor, and key press data generated by the user's interaction with the phone. Data were recorded in three usage scenarios: (1) document reading; (2) text production; (3) navigation on a map to locate a destination.

3. Data Collection Process

We recruited 100 volunteers for a large-scale data collection. When a volunteer logged into the data collection tool, he or she was randomly assigned a reading, writing, or map navigation session. In each session, the volunteer either sat or walked while completing the tasks. Each session lasted about 5 to 15 minutes, and each volunteer was expected to perform 24 sessions (8 reading sessions, 8 writing sessions, and 8 map navigation sessions). In total, each volunteer in our experiments contributed about 2 to 6 hours of behavioral data.
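
The per-volunteer total follows directly from the session counts and durations stated above. The short Python check below reproduces that arithmetic; the variable names are illustrative only and are not part of the data collection tool.

```python
# Figures taken from the description above.
SESSIONS_PER_TASK = 8
TASKS = ("reading", "writing", "map navigation")
MIN_MINUTES, MAX_MINUTES = 5, 15  # approximate length of one session

# 8 sessions for each of the 3 task types.
sessions_per_volunteer = SESSIONS_PER_TASK * len(TASKS)

# Convert the total session time from minutes to hours.
min_hours = sessions_per_volunteer * MIN_MINUTES / 60
max_hours = sessions_per_volunteer * MAX_MINUTES / 60

print(sessions_per_volunteer)  # 24
print(min_hours, max_hours)    # 2.0 6.0
```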

4. Dataset Content

The following 9 categories of data are recorded:

  1. Accelerometer
  2. Gyroscope
  3. Magnetometer
  4. Raw touch event
  5. Tap gesture
  6. Scale gesture
  7. Scroll gesture
  8. Fling gesture
  9. Key press on virtual keyboard

The current dataset includes all data collected from the 100 volunteers. For each session, there are nine CSV files, each of which corresponds to one of the above data categories. There is another CSV file recording the metadata of the session. The total size of this dataset is about 6 GB after compression.
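
As a starting point for working with the files, the following Python sketch reads all CSV files of one session into memory. It is a minimal sketch under stated assumptions: it assumes that the CSV files of a session sit together in one directory, and the function names `load_session` and `has_expected_file_count` are hypothetical helpers, not part of the dataset or its tooling. It makes no assumption about file names, header rows, or column meanings; consult the files themselves for those details.

```python
import csv
from pathlib import Path

# Nine data categories plus one metadata file per session, per the text above.
EXPECTED_CSV_FILES_PER_SESSION = 10


def load_session(session_dir):
    """Read every CSV file in one session directory (assumed layout).

    Returns a dict mapping each file's stem (name without ".csv") to a list
    of rows, where each row is a list of strings. No header row or column
    meaning is assumed.
    """
    tables = {}
    for csv_path in sorted(Path(session_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            tables[csv_path.stem] = list(csv.reader(handle))
    return tables


def has_expected_file_count(tables):
    """Check that a loaded session contains the ten CSV files described above."""
    return len(tables) == EXPECTED_CSV_FILES_PER_SESSION
```

A call such as `load_session("path/to/one/session")`, where the path is a placeholder for wherever the archive was extracted, returns the tables for that session; `has_expected_file_count` can then flag sessions with missing or extra files.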

5. Publications and Citations


Zdeňka Sitová, Jaroslav Šeděnka, Qing Yang, Ge Peng, Gang Zhou, Paolo Gasti, Kiran S. Balagani. HMOG: New Behavioral Biometric Features for Continuous Authentication of Smartphone Users. In IEEE Transactions on Information Forensics and Security, vol.PP, no.99, pp.1-1.
BibTeX file, link on IEEE Xplore Digital Library


Qing Yang, Ge Peng, David T. Nguyen, Xin Qi, Gang Zhou, Zdeňka Sitová, Paolo Gasti, and Kiran S. Balagani. A Multimodal Data Set for Evaluating Continuous Authentication Performance in Smartphones. In Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems (SenSys '14). ACM, New York, NY, USA, 358-359.
BibTeX file, link on ACM Digital Library

6. Acknowledgement

This work was supported in part by DARPA Active Authentication grant FA8750-13-2-0266, NSF CAREER Grant CNS-1253506, and a 2013 NYIT ISRC grant. The views, findings, recommendations, and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the sponsoring agencies or the U.S. Government.

If you have any questions, please contact us: hmog.dataset@gmail.com

7. Dataset Download

If you want to obtain a copy of the dataset, please read the following terms. If you agree, click the agree button and you will be redirected to the download page.