Potentials and Limitations of
        Educational Datasets

        Hendrik Drachsler
            Open University of the Netherlands
Hendrik Drachsler
• Assistant professor at the Centre for Learning
  Sciences and Technologies (CELSTEC)
• Track record in TEL projects such as
  TENCompetence, SC4L, LTfLL, Handover, dataTEL.
• Main research focus:
   – Personalization of learning with information
     retrieval technologies, recommender systems and
     educational datasets
   – Visualization of educational data, data mash-up
     environments, supporting context-awareness by
     data mining
   – Social and ethical implications of data mining in
     education
• Leader of the dataTEL Theme Team of the
  STELLAR network of excellence (join the SIG on
  TELeurope.eu)
• Just recently: new alterEGO project granted by the
  Netherlands Laboratory for Lifelong Learning (on
  limitations of learning analytics in formal and
  informal learning)
dataTEL
Potentials and Limitations of Educational Datasets
24.07.2011 MUP/PLE lecture series, Knowledge Media Institute, Open University UK




Hendrik Drachsler                                                      #dataTEL
Centre for Learning Sciences and Technologies
@ Open University of the Netherlands
Goals of the lecture
 1. Motivation for dataTEL

 2. The dataTEL project

 3. Potentials of dataTEL

 4. Open issues of dataTEL

TEL RecSys Research




Survey on TEL Recommender

Observation:
Half of the systems (11/20) are still at the design or prototyping stage;
only 8 systems have been evaluated through trials with human users.

Conclusion:
Small-scale experiments in which a few learners rate some resources
add little to the knowledge base on recommender systems and
personalization in TEL.

Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H. G. K., & Koper, R. (2011).
Recommender Systems in Technology Enhanced Learning. In P. B. Kantor, F.
Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp.
387-415). Berlin: Springer.
The TEL recommender research is a
           bit like this...

We need to design for each domain an appropriate recommender system
that fits the goals, tasks, and particular constraints.

But...
“The performance results of different research efforts in TEL
recommender systems are hardly comparable.”
(Manouselis et al., 2010)

The TEL recommender experiments lack transparency. They need
to be repeatable in order to test:
• Validity
• Verification
• Comparability of results

Image: Kaptain Kobold, http://www.flickr.com/photos/kaptainkobold/3203311346/

How others compare their
        recommenders

Although the TEL domain stores plenty of data every day in e-learning
environments (LMS, PLEs), there is a lack of shareable and publicly
available datasets.

Who is dataTEL?
         dataTEL is a Theme Team funded by the
             STELLAR network of excellence

  Riina   Stephanie    Katrien      Nikos      Martin     Hendrik
Vuorikari Lindstaedt   Verbert    Manouselis   Wolpers   Drachsler

             MAVSEL                    CEN PT
                                       Social Data

  Miguel                     Joris
Angel Sicillia              Klerkx

dataTEL::Objectives
Make the research on TEL RecSys more comparable by
lowering the entrance barriers for other researchers and
increasing the quality.

The required benchmarks therefore are:

1. A collection of publicly available datasets ranging from
   formal to non-formal learning settings

2. An overview of the research results of certain RecSys
   technologies on different datasets

3. A common approach to evaluate RecSys in the domain
   of TEL

dataTEL::Objectives
1. Collect publicly available datasets
2. Establish a sharing policy to (re)use and share datasets
3. Define dataset standards (documentation, pre-processing)
4. Address privacy and legal protection rights
5. Create evaluation criteria for TEL recommender systems
6. Create a body of knowledge on personalization in TEL

dataTEL::Collection
   • The collected datasets differ widely in their numbers of users
     and resources
   • Most of the data is very sparse
   • Privacy regulations hinder data sharing
   • Most of the data comes from informal learning settings

Drachsler, H., Bogers, T., Vuorikari, R., Verbert, K., Duval, E., Manouselis, N.,
Beham, G., Lindstaedt, S., Stern, H., Friedrich, M., & Wolpers, M. (2010). Issues
and Considerations regarding Sharable Data Sets for Recommender
Systems in Technology Enhanced Learning. Presentation at the 1st Workshop
Recommender Systems in Technology Enhanced Learning (RecSysTEL) in conjunction
with 5th European Conference on Technology Enhanced Learning (EC-TEL 2010):
Sustaining TEL: From Innovation to Learning and Practice. September 28, 2010,
Barcelona, Spain.

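To make the sparsity observation concrete, the following is a minimal sketch of how the sparsity of a user-resource dataset could be measured. The function name `sparsity` and the toy log are illustrative assumptions and are not taken from any of the dataTEL datasets.

```python
def sparsity(interactions, n_users, n_items):
    """Return the fraction of user-resource cells with no recorded interaction."""
    observed = len(set(interactions))  # count each (user, resource) pair once
    return 1.0 - observed / (n_users * n_items)


# Hypothetical toy log of (user_id, resource_id) pairs, for illustration only.
toy_log = [("u1", "r1"), ("u1", "r2"), ("u2", "r2"), ("u3", "r4")]
users = {u for u, _ in toy_log}
items = {r for _, r in toy_log}
toy_sparsity = sparsity(toy_log, len(users), len(items))
```

A value close to 1.0 means that almost no user has interacted with almost any resource, which is the situation the slide describes for most of the collected data.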
dataTEL::Body of knowledge
Outcomes:
• Tanimoto similarity + item-based CF was the most accurate.
• Implicit ratings such as download rates and bookmarks can be
  used successfully in TEL.

Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., Beham, G., Duval, E.,
(2011). Dataset-driven Research for Improving Recommender Systems for Learning. Learning
Analytics & Knowledge: February 27-March 1, 2011, Banff, Alberta, Canada
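
The combination named in the outcomes can be sketched as follows: item-based collaborative filtering over implicit feedback (for example bookmarks or downloads), using the Tanimoto coefficient as the item similarity. This is a simplified illustration under stated assumptions, not the setup used in the cited study; the function names, the scoring rule (summed similarity) and the toy data are all hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto coefficient of two sets: intersection size over union size."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend(user, interactions, top_n=3):
    """Rank unseen items by summed Tanimoto similarity to the user's items."""
    users_by_item = defaultdict(set)
    for u, item in interactions:
        users_by_item[item].add(u)  # an item is described by the users who touched it
    seen = {item for u, item in interactions if u == user}
    scores = {}
    for candidate, cand_users in users_by_item.items():
        if candidate in seen:
            continue
        score = sum(tanimoto(cand_users, users_by_item[s]) for s in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by item id for deterministic output.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical implicit feedback: (user, resource) means "bookmarked or downloaded".
toy = [
    ("ann", "r1"), ("ann", "r2"),
    ("bob", "r1"), ("bob", "r2"), ("bob", "r3"),
    ("cat", "r2"), ("cat", "r3"),
    ("dan", "r4"),
]
```

Because the Tanimoto coefficient only needs to know whether an interaction happened, it suits implicit ratings where no explicit star rating is available.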
Potentials of Open Data

Example by Tim Berners-Lee: The year open data went worldwide, TED talk, February 2010

Data = New Science Paradigm
• A thousand years ago: science was
 empirical (describing natural
 phenomena)

• Last few hundred years:
 theoretical branch (using
 models, generalizations)

• Last few decades: computational
 branch (simulating complex phenomena)

• Nowadays: data science
 (unifying theory, experiment, and simulation;
 data captured by instruments and processed
 by software; linked data)

Promises of Open Data for TEL
Unexploited potentials for TEL:
• The evaluation of learning theories and learning
 technology from the data side

• More transparent, mutually comparable, trusted and
 repeatable experiments that lead to evidence-driven
 knowledge

• Development of new educational data tools / products
 that combine different data sources in data mashups

• Gain new insights / new knowledge by combining so far
 unconnected resources / tools

Data Products
      Educational Data Products
      • Drop-out Analyzer
      • Group Formation Recommender
      • Question-Answering Tool
      • Awareness Tools

W. Reinhardt, C. Mletzko, H. Drachsler, and P. Sloep. AWESOME: A widget-based
dashboard for awareness-support in Research Networks. In Proceedings of the 2nd PLE
Conference, Southampton, UK, 2011.

dataTEL::Open issues
1. Privacy
2. Prepare datasets
3. Share datasets
4. Body of knowledge

Privacy

OVERSHARING
Were the founders of PleaseRobMe.com actually
allowed to take the data from the web and present it
in that way?

Are we allowed to use data from social services and
reuse it for research purposes?

Privacy
1. Privacy as confidentiality
   The right to be let alone (Warren and Brandeis, 1890)

2. Privacy as control
   The right of the individual to decide what information
   about herself should be communicated to others and
   under which circumstances.

3. Privacy as practice
   The right to intervene in the flows of existing data and
   the re-negotiation of boundaries with respect to
   collected data.

Privacy solutions
1. Privacy as confidentiality
   Information services that minimize, secure or
   anonymize the collected information

2. Privacy as control
   Identity Management Systems (IDMS)
   with access control rules

3. Privacy as practice
   Timestamps on data, data degradation technologies

Prepare datasets
1. Create a dataset that realistically reflects the variables of the
   learning setting.
2. Use a sufficiently large set of user profiles.
3. Create datasets that are comparable to others.

Image: Justin Marshall, Coded Ornament by rootoftwo,
http://www.flickr.com/photos/rootoftwo/267285816

Prepare datasets
For informal data sets:

1. Collect data
2. Process data
3. Document data
4. Share data

For formal data sets from LMS:

1. Data storing scripts
2. Anonymisation scripts
3. Document data
4. Share data
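
As an illustration of what an anonymisation script for LMS data might do, the sketch below drops directly identifying fields and replaces user identifiers with a keyed hash. Note that this is pseudonymisation, which is weaker than full anonymisation, and it is only one possible approach; the field names (`user_id`, `name`, `email`), the function names and the key are assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymise(user_id, secret_key):
    """Map a user id to a stable pseudonym via HMAC-SHA256 (truncated hex digest)."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymise_log(rows, secret_key, drop_fields=("name", "email")):
    """Drop identifying fields and pseudonymise the user id of every log row."""
    cleaned = []
    for row in rows:
        out = {k: v for k, v in row.items() if k not in drop_fields}
        out["user_id"] = pseudonymise(row["user_id"], secret_key)
        cleaned.append(out)
    return cleaned


# Hypothetical LMS export rows; the key must be kept secret and never shared with the dataset.
key = b"replace-with-a-private-key"
rows = [
    {"user_id": "s1001", "name": "A. Learner", "email": "a@example.org", "resource": "r7"},
    {"user_id": "s1001", "name": "A. Learner", "email": "a@example.org", "resource": "r9"},
    {"user_id": "s2002", "name": "B. Learner", "email": "b@example.org", "resource": "r7"},
]
shared = anonymise_log(rows, key)
```

The same learner keeps the same pseudonym across rows, so usage patterns stay analysable, while names and e-mail addresses never leave the institution. Whether this is sufficient for a given dataset still depends on the applicable privacy regulations.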

Share/cite datasets

Sharing policies

Sharing policy guidelines




A brief guide on data licenses developed by SURF and the Centre for
Intellectual Property Law (CIER), 2009, available at
www.surffoundation.nl
Body of knowledge
                         Datasets
Formal                                                Informal


   Data A                   Data B                   Data C



Algorithms:            Algorithms:            Algorithms:
Algorithm A            Algorithm D            Algorithm B
Algorithm B            Algorithm E            Algorithm D
Algorithm C

Models:                Models:                Models:
Learner Model A        Learner Model C        Learner Model A
Learner Model B        Learner Model E        Learner Model C

Measured attributes:   Measured attributes:   Measured attributes:
Attribute A            Attribute A            Attribute A
Attribute B            Attribute B            Attribute B
Attribute C            Attribute C            Attribute C
dataTEL::SIG
   http://www.teleurope.eu/pg/groups/9405/datatel/
Objectives:
• Representing dataTEL researchers to promote the release
  of open datasets from educational providers
• Fostering the standardization of datasets to enable
  exchange and interoperability
• Contributing to policies on ethical guidelines (privacy and
  legal protection rights)
• Fostering a shared understanding of evaluation methods in
  TEL RecSys and Learning Analytics technologies.

Many thanks for your interest

                                       Free
                                     the data

picture by Tom Raftery   http://www.flickr.com/photos/traftery/4773457853/sizes/l

   These slides are available at:
   http://www.slideshare.com/Drachsler

   Email:       hendrik.drachsler@ou.nl
   Skype:       celstec-hendrik.drachsler
   Blogging at: http://www.drachsler.de
   Twittering at: http://twitter.com/HDrachsler

More Related Content

KEY
dataTEL - Datasets for Recommender Systems in Technology-Enhanced Learning
PPTX
The Future of Open Science
PPTX
Open science, open data - FOSTER training, Potsdam
PDF
LAK13 linkedup tutorial_evaluation_framework
PPTX
From Open Data to Open Science, by Geoffrey Boulton
PDF
Building Capacity for Open Science
PPTX
The culture of researchData
PPT
Using Simulations to Evaluated the Effects of Recommender Systems for Learner...
dataTEL - Datasets for Recommender Systems in Technology-Enhanced Learning
The Future of Open Science
Open science, open data - FOSTER training, Potsdam
LAK13 linkedup tutorial_evaluation_framework
From Open Data to Open Science, by Geoffrey Boulton
Building Capacity for Open Science
The culture of researchData
Using Simulations to Evaluated the Effects of Recommender Systems for Learner...

What's hot (20)

PDF
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
PPTX
The Art and Science of Analyzing Software Data
PPTX
The Challenges of Making Data Travel, by Sabina Leonelli
PPTX
Rare (and emergent) disciplines in the light of science studies
PPTX
The Evolution of e-Research: Machines, Methods and Music
PPTX
Tablet Diffusion, Adoption and Implementation in Academic Libraries
PPTX
DH2012_Bellamy
PDF
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
PDF
Rda nitrd 2015 berman - final
PPTX
DREaM 5: Building evidence of the value and impact of library information ser...
PPTX
sciPADS presentation @ JURE conference 2014 in Nicosia, Cyprus
PPTX
Domain-specific Knowledge Extraction from the Web of Data
PDF
Introduction Presentation for LinkedUp kickoff meeting
PPTX
Science 2.0 and language technology
PDF
Mining and Understanding Activities and Resources on the Web
PPTX
Measuring Value and ROI of Academic Libraries: The IMLS Lib Value Project
PPT
Investigation the Effect of Users’ Tagging Motivation on the Digital Educatio...
PDF
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
PPTX
First Steps toward a Digital Scholarship Center
PDF
Open Access and Open Data: what do I need to know (and do)?
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
The Art and Science of Analyzing Software Data
The Challenges of Making Data Travel, by Sabina Leonelli
Rare (and emergent) disciplines in the light of science studies
The Evolution of e-Research: Machines, Methods and Music
Tablet Diffusion, Adoption and Implementation in Academic Libraries
DH2012_Bellamy
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Rda nitrd 2015 berman - final
DREaM 5: Building evidence of the value and impact of library information ser...
sciPADS presentation @ JURE conference 2014 in Nicosia, Cyprus
Domain-specific Knowledge Extraction from the Web of Data
Introduction Presentation for LinkedUp kickoff meeting
Science 2.0 and language technology
Mining and Understanding Activities and Resources on the Web
Measuring Value and ROI of Academic Libraries: The IMLS Lib Value Project
Investigation the Effect of Users’ Tagging Motivation on the Digital Educatio...
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
First Steps toward a Digital Scholarship Center
Open Access and Open Data: what do I need to know (and do)?
Ad

Viewers also liked (20)

PPT
Workplace etiquettes
PDF
Еmail vs Social — Евгений Вольнов
PDF
Launch of the EATEL SIG dataTEL at ECTEL 2011
PPT
A methodology to design customized learning networks
ODP
DOCX
Afectar al paciente
PPS
Kkka 2008 Halkegitimi
PPT
M02 un07 p02
PPT
M02 un10 p01
PDF
Unit 2.6
PPT
Unit 2.1 Part 4
PDF
Placer County Prefab Housing - EE in HOME Workshop
PPTX
Publiwide, eBooks as a service model
PPT
Thinking about thinking
PPT
What Do We Know About IPL Users?
PPT
Alliance Department June 2008
PPT
e-mail
PDF
Unit 2.12
PDF
Licensario — система оплаты для стартапов. in-app billing
Workplace etiquettes
Еmail vs Social — Евгений Вольнов
Launch of the EATEL SIG dataTEL at ECTEL 2011
A methodology to design customized learning networks
Afectar al paciente
Kkka 2008 Halkegitimi
M02 un07 p02
M02 un10 p01
Unit 2.6
Unit 2.1 Part 4
Placer County Prefab Housing - EE in HOME Workshop
Publiwide, eBooks as a service model
Thinking about thinking
What Do We Know About IPL Users?
Alliance Department June 2008
e-mail
Unit 2.12
Licensario — система оплаты для стартапов. in-app billing
Ad

Similar to Potentials and Limitations of Educational Datasets (20)

PDF
Recent Research and Developments on Recommender Systems in TEL
PPTX
Dataset-driven research to improve TEL recommender systems
PDF
Learning Analytics Metadata Standards, xAPI recipes & Learning Record Store -
PDF
Turning Learning into Numbers - A Learning Analytics Framework
PDF
Six dimensions of Learning Analytics
PPTX
De zes dimensies van learning analytics
PDF
Evaluation of Linked Data tools for Learning Analytics
PPTX
VII Jornadas eMadrid "Education in exponential times"."Maturing the learning ...
PPTX
Fighting level 3: From the LA framework to LA practice on the micro-level
PPT
The Dutch Approach to Research Data Infrastructure
PPTX
Research and Deployment of Analytics in Learning Settings
PPTX
Dutch Cooking with xAPI Recipes, The Good, the Bad, and the Consistent
PDF
Trusted Learning Analytics Research Program
PDF
Data Sets as Facilitator for new Products and Services for Universities
PDF
Inaugural lecture
PPTX
Towards Tangible Trusted Learning Analytics
PPTX
Open science at Opencamp
PPT
Sirtel Workshop
PPT
3rd Workshop on Social Information Retrieval for Technology-Enhanced Learnin...
PPTX
Technology and the Grand Challenge for Future Learning
Recent Research and Developments on Recommender Systems in TEL
Dataset-driven research to improve TEL recommender systems
Learning Analytics Metadata Standards, xAPI recipes & Learning Record Store -
Turning Learning into Numbers - A Learning Analytics Framework
Six dimensions of Learning Analytics
De zes dimensies van learning analytics
Evaluation of Linked Data tools for Learning Analytics
VII Jornadas eMadrid "Education in exponential times"."Maturing the learning ...
Fighting level 3: From the LA framework to LA practice on the micro-level
The Dutch Approach to Research Data Infrastructure
Research and Deployment of Analytics in Learning Settings
Dutch Cooking with xAPI Recipes, The Good, the Bad, and the Consistent
Trusted Learning Analytics Research Program
Data Sets as Facilitator for new Products and Services for Universities
Inaugural lecture
Towards Tangible Trusted Learning Analytics
Open science at Opencamp
Sirtel Workshop
3rd Workshop on Social Information Retrieval for Technology-Enhanced Learnin...
Technology and the Grand Challenge for Future Learning

More from Hendrik Drachsler (20)

PDF
Smart Speaker as Studying Assistant by Joao Pargana
PDF
Verhaltenskodex Trusted Learning Analytics
PDF
Rödling, S. (2019). Entwicklung einer Applikation zum assoziativen Medien Ler...
PDF
E.Leute: Learning the impact of Learning Analytics with an authentic dataset
PDF
Romano, G. (2019) Dancing Trainer: A System For Humans To Learn Dancing Using...
PPTX
Trusted Learning Analytics
PDF
LACE Project Overview and Exploitation
PPTX
Recommendations for Open Online Education: An Algorithmic Study
PDF
Privacy and Analytics – it’s a DELICATE Issue. A Checklist for Trusted Learni...
PDF
DELICATE checklist - to establish trusted Learning Analytics
PDF
LACE Flyer 2016
PPT
The Future of Big Data in Education
PDF
The Future of Learning Analytics
PDF
Ethics privacy washington
PDF
Ethics and Privacy in the Application of Learning Analytics (#EP4LA)
PPTX
The Impact of Learning Analytics on the Dutch Education System
PDF
LAK14 Data Challenge
PDF
Standardisierte Medizinische Übergaben - Wie lernen, lehren und implementiere...
PDF
R&D activites on Learning Analytics
PDF
Hoe ziet de toekomst van Learning Analytics er uit?
Smart Speaker as Studying Assistant by Joao Pargana
Verhaltenskodex Trusted Learning Analytics
Rödling, S. (2019). Entwicklung einer Applikation zum assoziativen Medien Ler...
E.Leute: Learning the impact of Learning Analytics with an authentic dataset
Romano, G. (2019) Dancing Trainer: A System For Humans To Learn Dancing Using...
Trusted Learning Analytics
LACE Project Overview and Exploitation
Recommendations for Open Online Education: An Algorithmic Study
Privacy and Analytics – it’s a DELICATE Issue. A Checklist for Trusted Learni...
DELICATE checklist - to establish trusted Learning Analytics
LACE Flyer 2016
The Future of Big Data in Education
The Future of Learning Analytics
Ethics privacy washington
Ethics and Privacy in the Application of Learning Analytics (#EP4LA)
The Impact of Learning Analytics on the Dutch Education System
LAK14 Data Challenge
Standardisierte Medizinische Übergaben - Wie lernen, lehren und implementiere...
R&D activites on Learning Analytics
Hoe ziet de toekomst van Learning Analytics er uit?

Recently uploaded (20)

PDF
KodekX | Application Modernization Development
PDF
Network Security Unit 5.pdf for BCA BBA.
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PPTX
sap open course for s4hana steps from ECC to s4
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
MYSQL Presentation for SQL database connectivity
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
Big Data Technologies - Introduction.pptx
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
KodekX | Application Modernization Development
Network Security Unit 5.pdf for BCA BBA.
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Encapsulation theory and applications.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Diabetes mellitus diagnosis method based random forest with bat algorithm
NewMind AI Weekly Chronicles - August'25 Week I
sap open course for s4hana steps from ECC to s4
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
MYSQL Presentation for SQL database connectivity
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Advanced methodologies resolving dimensionality complications for autism neur...
Unlocking AI with Model Context Protocol (MCP)
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
The Rise and Fall of 3GPP – Time for a Sabbatical?
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Big Data Technologies - Introduction.pptx
Digital-Transformation-Roadmap-for-Companies.pptx
MIND Revenue Release Quarter 2 2025 Press Release
“AI and Expert System Decision Support & Business Intelligence Systems”

Potentials and Limitations of Educational Datasets

  • 1. Potentials and Limitations of Educational Datasets Hendrik Drachsler Open University of the Netherlands
  • 2. Hendrik Drachsler • Assistant professor at the Centre for Learning Sciences and Technologies (CELSTEC) • Track record in TEL projects such as TENCompetence, SC4L, LTfLL, Handover, dataTEL. • Main research focus: – Personalization of learning with information retrieval technologies, recommender systems and educational datasets – Visualization of educational data, data mash-up environments, supporting context-awareness by data mining – Social and ethical implications of data mining in education • Leader of the dataTEL Theme Team of the STELLAR network of excellence (join the SIG on TELeurope.eu) • Just recently: new alterEGO project granted by the Netherlands Laboratory for Lifelong Learning (on limitations of learning analytics in formal and informal learning)
  • 3. dataTEL Potentials and Limitations of Educational Datasets 24.07.2011 MUP/PLE lecture series, Knowledge Media Institute, Open University UK Hendrik Drachsler #dataTEL Centre for Learning Sciences and Technology @ Open University of the Netherlands3
  • 4. Goals of the lecture 1.Motivation or dataTEL 2.The dataTEL project 3.Potentials of dataTEL 4.Open issues of dataTEL 4
  • 6. Survey on TEL Recommender 6
  • 7. Survey on TEL Recommender Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H. G. K., & Koper, R. (2011). Recommender Systems in Technology Enhanced Learning. In P. B. Kantor, F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp. 387-415). Berlin: Springer. 6
  • 8. Survey on TEL Recommender Observation: Half of the systems (11/20) still at design or prototyping stage; only 8 systems evaluated through trials with human users. Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H. G. K., & Koper, R. (2011). Recommender Systems in Technology Enhanced Learning. In P. B. Kantor, F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp. 387-415). Berlin: Springer.
  • 9. Survey on TEL Recommender Observation: Half of the systems (11/20) still at design or prototyping stage; only 8 systems evaluated through trials with human users. Conclusion: Small-scale experiments with a few learners that rate some resources contribute only little to a knowledge base on recommender systems and personalization in TEL. Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H. G. K., & Koper, R. (2011). Recommender Systems in Technology Enhanced Learning. In P. B. Kantor, F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp. 387-415). Berlin: Springer.
  • 10. The TEL recommender research is a bit like this... 7
  • 11. The TEL recommender research is a bit like this... We need to design for each domain an appropriate recommender system that fits the goals, tasks, and particular constraints 7
  • 12. But... “The performance results of different research efforts in TEL recommender systems are hardly comparable.” (Manouselis et al., 2010) Kaptain Kobold http://www.flickr.com/photos/kaptainkobold/3203311346/
  • 13. But... “The performance results of different research efforts in TEL recommender systems are hardly comparable.” (Manouselis et al., 2010) The TEL recommender experiments lack transparency. They need to be repeatable to test: • Validity • Verification • Compare results Kaptain Kobold http://www.flickr.com/photos/kaptainkobold/3203311346/
  • 14. How others compare their recommenders 9
  • 15. How others compare their recommenders Although the TEL domain stores plenty of data every day in e-learning environments (LMS, PLEs), there is a lack of shareable and publicly available datasets.
  • 16. Goals of the lecture 1. Motivation for dataTEL 2. The dataTEL project 3. Potentials of dataTEL 4. Open issues of dataTEL
  • 17. Who is dataTEL? dataTEL is a Theme Team funded by the STELLAR network of excellence: Riina Vuorikari, Stephanie Lindstaedt, Katrien Verbert, Nikos Manouselis, Martin Wolpers, Hendrik Drachsler
  • 18. Who is dataTEL? dataTEL is a Theme Team funded by the STELLAR network of excellence: Riina Vuorikari, Stephanie Lindstaedt, Katrien Verbert, Nikos Manouselis, Martin Wolpers, Hendrik Drachsler. MAVSEL, CEN PT Social Data. Miguel Angel Sicilia, Joris Klerkx
  • 20. dataTEL::Objectives Make the research on TEL RecSys more comparable by lowering the entrance barriers for other researchers and increasing the quality. The required benchmarks therefore are: 1. A collection of publicly available datasets ranging from formal to non-formal learning settings 2. An overview of the research results of certain RecSys technologies on different datasets 3. A common approach to evaluate RecSys in the domain of TEL
  • 21. dataTEL::Objectives 1. Collecting publicly available datasets 2. Sharing policy to (re)use and share datasets 3. Define dataset standards (documentation, pre-processing) 4. Address privacy and legal protection rights 5. Create evaluation criteria for TEL recommender systems 6. Create a body of knowledge on personalization in TEL
  • 24. dataTEL::Collection Drachsler, H., Bogers, T., Vuorikari, R., Verbert, K., Duval, E., Manouselis, N., Beham, G., Lindstaedt, S., Stern, H., Friedrich, M., & Wolpers, M. (2010). Issues and Considerations regarding Sharable Data Sets for Recommender Systems in Technology Enhanced Learning. Presentation at the 1st Workshop Recommender Systems in Technology Enhanced Learning (RecSysTEL) in conjunction with 5th European Conference on Technology Enhanced Learning (EC-TEL 2010): Sustaining TEL: From Innovation to Learning and Practice. September 28, 2010, Barcelona, Spain.
  • 25. dataTEL::Collection • Collected data is very different with respect to amount of users and resources • Most of the data is very sparse • Privacy regulations harm data sharing • Mostly data from informal learning settings Drachsler, H., Bogers, T., Vuorikari, R., Verbert, K., Duval, E., Manouselis, N., Beham, G., Lindstaedt, S., Stern, H., Friedrich, M., & Wolpers, M. (2010). Issues and Considerations regarding Sharable Data Sets for Recommender Systems in Technology Enhanced Learning. Presentation at the 1st Workshop Recommender Systems in Technology Enhanced Learning (RecSysTEL) in conjunction with 5th European Conference on Technology Enhanced Learning (EC-TEL 2010): Sustaining TEL: From Innovation to Learning and Practice. September 28, 2010, Barcelona, Spain.
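The "very sparse" observation on the Collection slide can be made concrete with a simple measure: the share of user-resource pairs for which no interaction was recorded. The following is a minimal sketch, not taken from the lecture; the function name `sparsity` and the toy log are hypothetical.

```python
def sparsity(interactions, n_users, n_items):
    """Fraction of the user-item matrix with no recorded interaction."""
    observed = len(set(interactions))  # count each (user, item) pair once
    return 1.0 - observed / (n_users * n_items)


# Hypothetical toy log of (user, resource) pairs, for illustration only.
toy_log = [("u1", "r1"), ("u1", "r2"), ("u2", "r2"), ("u3", "r4")]

# 3 users and 4 resources give 12 possible pairs, of which 4 are observed.
toy_sparsity = sparsity(toy_log, n_users=3, n_items=4)
print(f"sparsity = {toy_sparsity:.2f}")
```

Reporting a figure like this alongside the number of users and resources is one way to make datasets of very different sizes easier to compare.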
  • 30. dataTEL::Body of knowledge Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., Beham, G., Duval, E. (2011). Dataset-driven Research for Improving Recommender Systems for Learning. Learning Analytics & Knowledge: February 27-March 1, 2011, Banff, Alberta, Canada
  • 31. dataTEL::Body of knowledge Outcomes: Tanimoto similarity + item-based CF was the most accurate. Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., Beham, G., Duval, E. (2011). Dataset-driven Research for Improving Recommender Systems for Learning. Learning Analytics & Knowledge: February 27-March 1, 2011, Banff, Alberta, Canada
  • 32. dataTEL::Body of knowledge Outcomes: Tanimoto similarity + item-based CF was the most accurate. Outcomes: Implicit ratings like download rates and bookmarks can be successfully used in TEL. Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., Beham, G., Duval, E. (2011). Dataset-driven Research for Improving Recommender Systems for Learning. Learning Analytics & Knowledge: February 27-March 1, 2011, Banff, Alberta, Canada
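To illustrate the two outcomes above, here is a minimal sketch of item-based collaborative filtering with Tanimoto similarity on implicit feedback (a bookmark or download counts as "used"). It is not the implementation evaluated in the cited study; the function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto coefficient of two sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend(user_items, user, top_n=3):
    """Rank unseen items by summed similarity to the items the user already used."""
    # Invert user -> items into item -> users.
    item_users = defaultdict(set)
    for u, items in user_items.items():
        for item in items:
            item_users[item].add(u)

    seen = user_items.get(user, set())
    scores = {}
    for candidate, candidate_users in item_users.items():
        if candidate in seen:
            continue
        score = sum(tanimoto(candidate_users, item_users[i]) for i in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical implicit feedback: which learner bookmarked which resource.
bookmarks = {
    "anna": {"r1", "r2"},
    "ben": {"r1", "r2", "r3"},
    "cara": {"r2", "r3"},
    "dan": {"r4"},
}
print(recommend(bookmarks, "anna"))
```

Because Tanimoto similarity only needs to know whether a learner touched a resource, it fits datasets that contain implicit signals rather than explicit star ratings.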
  • 33. Goals of the lecture 1. Motivation for dataTEL 2. The dataTEL project 3. Potentials of dataTEL 4. Open issues of dataTEL
  • 34. Potentials of Open Data Example by Tim Berners-Lee: The year open data went worldwide, TED talk FEB 2010 19
  • 36. Data = New Science Paradigm • A thousand years ago: science was empirical (describing natural phenomena)
  • 37. Data = New Science Paradigm • A thousand years ago: science was empirical (describing natural phenomena) • Last few hundred years: theoretical branch (using models, generalizations)
  • 38. Data = New Science Paradigm • A thousand years ago: science was empirical (describing natural phenomena) • Last few hundred years: theoretical branch (using models, generalizations) • Last few decades: computational branch (simulating complex phenomena)
  • 39. Data = New Science Paradigm • A thousand years ago: science was empirical (describing natural phenomena) • Last few hundred years: theoretical branch (using models, generalizations) • Last few decades: computational branch (simulating complex phenomena) • Nowadays: data science (unifying theory, experiment, and simulation; data captured by instruments and processed by software; linked data)
  • 40. Promises of Open Data for TEL 21
  • 41. Promises of Open Data for TEL Unexploited potentials for TEL: • The evaluation of learning theories and learning technology from the data side • More transparent, mutually comparable, trusted and repeatable experiments that lead to evidence-driven knowledge • Development of new educational data tools / products that combine different data sources in data mashups • Gain new insights / new knowledge by combining so far unconnected resources / tools 21
  • 45. Data Products W. Reinhardt, C. Mletzko, H. Drachsler, and P. Sloep. AWESOME: A widget-based dashboard for awareness-support in Research Networks. In Proceedings of the 2nd PLE Conference, Southampton, UK, 2011. 22
  • 46. Data Products Educational Data Products • Drop-out Analyzer • Group Formation Recommender • Question-Answering Tool • Awareness Tools W. Reinhardt, C. Mletzko, H. Drachsler, and P. Sloep. AWESOME: A widget-based dashboard for awareness-support in Research Networks. In Proceedings of the 2nd PLE Conference, Southampton, UK, 2011. 22
  • 47. Goals of the lecture 1. Motivation for dataTEL 2. The dataTEL project 3. Potentials of dataTEL 4. Open issues of dataTEL
  • 49. Privacy 25
  • 52. Privacy OVERSHARING Were the founders of PleaseRobMe.com actually allowed to take the data from the web and present it in that way? 25
  • 53. Privacy OVERSHARING Were the founders of PleaseRobMe.com actually allowed to take the data from the web and present it in that way? Are we allowed to use data from social services and reuse it for research purposes? 25
  • 54. Privacy 26
  • 55. Privacy 1.Privacy as confidentiality The right to be let alone (Warren and Brandeis, 1890) 26
  • 56. Privacy 1.Privacy as confidentiality The right to be let alone (Warren and Brandeis, 1890) 2.Privacy as control The right of the individual to decide what information about herself should be communicated to others and under which circumstances. 26
  • 57. Privacy 1.Privacy as confidentiality The right to be let alone (Warren and Brandeis, 1890) 2.Privacy as control The right of the individual to decide what information about herself should be communicated to others and under which circumstances. 3.Privacy as practice The right to intervene in the flows of existing data and the re-negotiation of boundaries with respect to collected data. 26
  • 59. Privacy solutions 1. Privacy as confidentiality Information services that minimize, secure or anonymize the collected information
  • 60. Privacy solutions 1. Privacy as confidentiality Information services that minimize, secure or anonymize the collected information 2. Privacy as control Identity Management Systems (IDMS), with access control rules
  • 61. Privacy solutions 1. Privacy as confidentiality Information services that minimize, secure or anonymize the collected information 2. Privacy as control Identity Management Systems (IDMS), with access control rules 3. Privacy as practice Timestamp on data, data degradation technologies
  • 62. Prepare datasets Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816
  • 63. Prepare datasets 1. Create a dataset that realistically reflects the variables of the learning setting. Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816
  • 64. Prepare datasets 1. Create a dataset that realistically reflects the variables of the learning setting. 2. Use a sufficiently large set of user profiles. Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816
  • 65. Prepare datasets 1. Create a dataset that realistically reflects the variables of the learning setting. 2. Use a sufficiently large set of user profiles. 3. Create datasets that are comparable to others. Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816
  • 66. Prepare datasets For informal data sets: 1. Collect data 2. Process data 3. Document data 4. Share data For formal data sets from LMS: 1. Data storing scripts 2. Anonymisation scripts 3. Document data 4. Share data 29
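The "Anonymisation scripts" step for LMS data could look roughly like the sketch below. It is an assumption, not a script from the dataTEL project: it drops direct identifiers and replaces the user ID with a keyed hash. The field names (`user_id`, `name`, `email`) and the function `pseudonymise` are hypothetical. Note that this is pseudonymisation rather than full anonymisation; remaining fields may still allow re-identification.

```python
import hashlib
import hmac

# Hypothetical column names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymise(records, secret_key, id_field="user_id"):
    """Drop direct identifiers and replace the user ID with a keyed hash."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        digest = hmac.new(
            secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
        ).hexdigest()
        # Same input ID always yields the same pseudonym, so usage
        # patterns per learner stay intact for recommender research.
        row[id_field] = digest[:16]
        cleaned.append(row)
    return cleaned


# Hypothetical LMS export rows, for illustration only.
raw = [
    {"user_id": "s123", "name": "A. Student", "email": "a@example.org", "resource": "r1"},
    {"user_id": "s123", "name": "A. Student", "email": "a@example.org", "resource": "r2"},
]
shared = pseudonymise(raw, secret_key=b"keep-this-key-private")
print(shared)
```

The key stays with the data provider; without it, the pseudonyms cannot be recomputed from a list of known student IDs.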
  • 73. Sharing policy guidelines A brief guide on data licenses developed by SURF and the Centre for Intellectual Property Law (CIER), 2009 available at www.surffoundation.nl 33
  • 74. Body of knowledge Datasets Formal Informal Data A Data B Data C Algorithms: Algorithms: Algorithms: Algorithm A Algorithm D Algorithm B Algorithm B Algorithm E Algorithm D Algorithm C Models: Models: Models: Learner Model A Learner Model C Learner Model A Learner Model B Learner Model E Learner Model C Measured attributes: Measured attributes: Measured attributes: Attribute A Attribute A Attribute A Attribute B Attribute B Attribute B Attribute C Attribute C Attribute C
  • 80. dataTEL::SIG http://www.teleurope.eu/pg/groups/9405/datatel/ Objectives: • Representing dataTEL researchers to promote the release of open datasets from educational providers • Fostering the standardization of datasets to enable exchange and interoperability • Contributing to policies on ethical guidelines (privacy and legal protection rights) • Fostering a shared understanding of evaluation methods in TEL RecSys and Learning Analytics technologies.
  • 81. Many thanks for your interest picture by Tom Raftery http://www.flickr.com/photos/traftery/4773457853/sizes/l
  • 82. Many thanks for your interest Free the data picture by Tom Raftery http://www.flickr.com/photos/traftery/4773457853/sizes/l
  • 83. Many thanks for your interest These slides are available at: http://www.slideshare.com/Drachsler Email: hendrik.drachsler@ou.nl Skype: celstec-hendrik.drachsler Blogging at: http://www.drachsler.de Twittering at: http://twitter.com/HDrachsler