BIM-224 Research Infrastructures 23
Materials and Tasks for the module "BIM-224, SoSe 2023, Blümel/Rossenova" for students at Hochschule Hannover. The materials are prepared with several colleagues from the Open Science Lab at TIB Hannover.
Session 1: Data harvesting interfaces / data collection
Slides are available here: https://docs.google.com/presentation/d/1IxRTQhTY8nwFaijHq78m0NvW6Qw_YAj3YQtyO9Nn6dg/edit?usp=sharing
Student homework task pages
- Gizem Ergün / https://de.wikiversity.org/wiki/Benutzer:ErGiz
- BIM-224, SoSe 2023 - Ahmad Hasan Ahmad
- User:Ahmad.Aroud - Wikiversity
- BIM-224, SoSe 2023 – Lisa Sommer
- Mohammad Darkhbani
- Anna Rahr
- https://beta.wikiversity.org/wiki/BIM-224,_SoSe_2023_–_Josef_Debase
- Memo Loran Tuku
- Marcel Kromm
Group task 1
Platform list
- Radar4Culture
- GNM catalog
- Forschungsbibliothek Gotha der Universität Erfurt
- Datenportal des MfN Berlin
- Herbarium Berolinense
- Sketchfab
- Porta Fontium
- Coding da Vinci
Type of API list
- OAI-PMH, example: https://dhb.thulb.uni-jena.de/oai/prints?verb=Identify
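The OAI-PMH example above is a plain HTTP GET request whose `verb` parameter selects the operation. The following is a minimal Python sketch of how such request URLs can be built and how an `Identify` response might be read. The function names and the sample response are hypothetical and only for illustration; the XML namespace and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by OAI-PMH 2.0
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical, shortened Identify response used only for illustration
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2023-05-12T10:00:00Z</responseDate>
  <request verb="Identify">https://example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""


def build_oai_url(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and extra arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


def parse_identify(xml_text):
    """Extract a few basic fields from an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response contains no <Identify> element")
    return {
        "repositoryName": identify.findtext("oai:repositoryName", namespaces=OAI_NS),
        "baseURL": identify.findtext("oai:baseURL", namespaces=OAI_NS),
        "protocolVersion": identify.findtext("oai:protocolVersion", namespaces=OAI_NS),
    }


if __name__ == "__main__":
    # Rebuilds the example URL from the list above
    print(build_oai_url("https://dhb.thulb.uni-jena.de/oai/prints", "Identify"))
    # A harvesting request usually names a metadata format, e.g. Dublin Core
    print(build_oai_url("https://dhb.thulb.uni-jena.de/oai/prints", "ListRecords",
                        metadataPrefix="oai_dc"))
    print(parse_identify(SAMPLE_IDENTIFY))
```

To try this against a live repository, fetch the built URL (for example with `urllib.request.urlopen`) and pass the response text to `parse_identify`.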
Group task 2
- Student name / dataset link
- Josef Debase / https://creating-new-dimensions.org/Herbarium/
- Gizem Ergün / https://creating-new-dimensions.org/mkg-in-3d/
- Lisa Sommer / Pflanzenbelege aus dem Botanischen Garten Berlin
- Ahmad Aroud / https://codingdavinci.de/node/2284
- Marcel Kromm / Schrott or not? – "Geschmacksverirrungen" aus der Zeit um 1900
- Ahmad Hasan Ahmad / HISTORISCHER PORTRÄTAUFNAHMEN
- Mohammad Darkhbani / https://creating-new-dimensions.org/Gothaer-Kunstkammer/
- Memo Loran Tuku / https://creating-new-dimensions.org/Porta-fontium/
- Jana Cornelius / SchauMichAn - Franz Seraph Stirnbrand (um 1788-1882) und seine Porträts der Stuttgarter Gesellschaft
- Anna Rahr / Raritäten aus der Sammlung der Historischen Kommunikation der Robert Bosch GmbH
Session 2: Data cleaning, reconciliation and enrichment
Slides are available here: https://docs.google.com/presentation/d/1HpXUXYcs-LDOQYuQzFv1SYYutKUYP8qG0mR5BG3fLyw/edit?usp=sharing
OpenRefine official documentation:
https://openrefine.org/docs/manual/facets
https://openrefine.org/docs/manual/transforming
OpenRefine video tutorial:
Homework presentations:
- Gizem Ergün / https://www.canva.com/design/DAFlSBYZBXM/yak3bDV2gREj--Isgx4Vpw/edit?utm_content=DAFlSBYZBXM&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton
- Lisa Sommer / Pflanzenbelege aus dem Botanischen Garten Berlin
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_19.05.23
- Ahmad Hasan Ahmad / Screenshots of the dataset in OpenRefine - Link
- Memo Loran Tuku / Screenshots - Imgur Link
- Josef Debase / https://imgur.com/a/3cDRlZd
- Ahmad Aroud / https://imgur.com/a/NdYbqEa
- Marcel Kromm / Screenshots
Session 3: Data in Wikidata
Slides are available here: https://docs.google.com/presentation/d/1bCilgycOApKcFjzelntD6zRf5WBU9804t_9Fb-Lc1E8/edit?usp=sharing
Homework presentations:
- Gizem Ergün / https://docs.google.com/presentation/d/1jdnIFpNe_P8d4SlN85J4MGHxTPgyaapMdopFyiwinDY/edit?usp=sharing
- Anna Rahr / https://imgur.com/a/AzJbEsD
- Memo Loran Tuku / https://miro.com/app/board/uXjVMEpcFxg=/?share_link_id=587248504323
- Josef Debase / https://imgur.com/a/YCJ1RP1
- Ahmad Hasan Ahmad / https://miro.com/app/board/uXjVMEvRMXE=/?share_link_id=326114468201
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_26.05.23
- Ahmad Aroud / https://imgur.com/a/rRwFtRx
- Lisa Sommer / https://miro.com/app/board/uXjVME1u33Q=/?share_link_id=981671095755
- Marcel Kromm / Presentation
Session 4: Data Upload and querying (26.05)
Slides are available here: https://docs.google.com/presentation/d/1ebFJXSKikUSyjjPIsXFwTqVV2-igku6ra5Vm83h5SWQ/edit?usp=sharing
Additional tutorials:
Complete upload pipeline tutorial: https://en.wikiversity.org/wiki/OpenRefine_to_Wikibase%3A_Data_Upload_Pipeline
Upload tutorial for media files in Wikimedia Commons: https://en.wikiversity.org/wiki/Uploading_media_files_to_a_Wikibase_with_OpenRefine
Homework presentations:
- Student name / presentation link (Google Slides, other slide platform, or wiki pages with screenshots)
- Jana Cornelius / https://de.wikiversity.org/wiki/BIM-224,_SoSe_2023_-_Janacn#Homework_Assignment_09.06.23
- Josef Debase / https://imgur.com/a/w6OSArp
- Anna Rahr / Google Slides Link
- Marcel Kromm / https://imgur.com/a/i676VZ5
- ...
Session Workshop: Fermenting Data Workshop (02.06)
Slides are available here: https://docs.google.com/presentation/d/1BHlO17nTTXccoPMgqXZBx46zuDvhnM52h5Wj9X8p36M/edit?usp=sharing
Wikibase instance:
https://fermentingdata.wikibase.cloud/w/index.php?title=Special:CreateAccount&returnto=Main+Page
Session 5: Data upload and querying (cont.) / Data visualisation and presentation (09.06)
Video recording of the lecture: https://drive.google.com/file/d/1q94LdQauMPErzK5Yp2jD1zq0_MjWWgCX/view?usp=sharing
Slides are available here: https://docs.google.com/presentation/d/1T1fPDI2jSQJ1Q6rAaARIgxTbmST5Py_C8pmCBBAlXWQ/edit?usp=sharing
Book an individual feedback session - 15 mins per person:
- 15:00: Lisa Sommer
- 15:15: -
- 15:30: Ahmad Aroud
- 15:45: -
- 16:00: Anna Rahr
- 16:15: Gizem Ergün
- 16:30: Memo Loran Tuku
- 16:45: Josef Debase
- 17:00: Ahmad Hasan Ahmad
- 17:15: Jana Cornelius
- 17:30: -
Session 6: Data publication and review
In this session we will review homework and discuss the requirements for the final assignment submission.
The final submission deadline is July 7th.
Final assignment submission instructions
1) Spreadsheet with the data you uploaded to Wikidata
2) Spreadsheet with the data you can download from the SPARQL endpoint with your main data query
3) Publication on GitHub Pages containing:
- your custom query results
- customized title / author / cover image
- customized additional text and optionally embedded data visualization as .svg and/or live results in an iframe.
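For item 2 above, the spreadsheet can be downloaded directly from the query service interface, but it can also be produced with a short script. The sketch below assumes the Wikidata Query Service endpoint and the standard SPARQL JSON results format; the function names and the User-Agent string are hypothetical and should be adapted.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Assumption: the Wikidata Query Service; replace with your Wikibase's endpoint if needed
ENDPOINT = "https://query.wikidata.org/sparql"


def run_query(query, endpoint=ENDPOINT):
    """Send a SPARQL query and return the parsed JSON result (needs network access)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={
        "Accept": "application/sparql-results+json",
        # Hypothetical User-Agent; put your own contact details here
        "User-Agent": "BIM-224-homework-sketch/0.1",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def results_to_csv(result):
    """Convert a SPARQL JSON result into CSV text, one column per query variable."""
    variables = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(variables)
    for binding in result["results"]["bindings"]:
        # Unbound (optional) variables are simply missing from a binding
        writer.writerow([binding.get(var, {}).get("value", "") for var in variables])
    return buffer.getvalue()
```

A possible use, with `my_query` holding your main data query as a string: write `results_to_csv(run_query(my_query))` to a `.csv` file and open it in any spreadsheet program.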
Information discussed during today's session
1) Adding proper Wikitext to Images in Commons when Uploading via OpenRefine
- A more detailed tutorial page, if you want to go more in-depth (esp. page 5 & 6): https://docs.google.com/document/d/1ENpZBOHvMESOst4Phh5gSRWlnAdBs-OMZt5j_cL-YGA/edit?usp=sharing
- For quick reference, check the screenshot and try to replicate it in your schema builder when uploading. Make sure you have all of these statements for the images, in addition to the Wikitext. The Depicts / Main subject statements link your image to the main object / artwork you uploaded to Wikidata.
- If you have photos of objects, you can use this simple Wikitext for all your photos (in addition to the statements shown in the screenshot):
== {{int:filedesc}} ==
{{Art photo}}
== {{int:license-header}} ==
{{CC-BY-4.0}}
Remember to check the license; the above is just an example!
If you copy my screenshot schema, you will also need to update the museum and the license to match the ones you are working with.
- If you have photos of paintings / artworks, you can use this simple Wikitext for all your photos (in addition to the statements shown in the screenshot):
== {{int:filedesc}} ==
{{Artwork}}
== {{int:license-header}} ==
{{CC-BY-4.0}}
- More details are available in the Google Doc shared above, but these instructions should be sufficient, too.
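If you prepare the Wikitext for many files outside of OpenRefine, the two variants above can be generated with a small helper. This is only a sketch: the function name and the `kind` values are hypothetical, it assumes each heading and template sits on its own line, and the license template is an example that must match the actual license of your images.

```python
# Templates named in the instructions above, keyed by a hypothetical "kind" label
INFO_TEMPLATES = {
    "photo": "Art photo",   # photos of objects
    "artwork": "Artwork",   # photos of paintings / artworks
}


def build_commons_wikitext(kind, license_template="CC-BY-4.0"):
    """Return minimal Commons Wikitext for one file.

    kind: "photo" or "artwork"
    license_template: name of the license template; the default is only an example
    """
    if kind not in INFO_TEMPLATES:
        raise ValueError(f"unknown kind {kind!r}; expected one of {sorted(INFO_TEMPLATES)}")
    lines = [
        "== {{int:filedesc}} ==",
        "{{" + INFO_TEMPLATES[kind] + "}}",
        "== {{int:license-header}} ==",
        "{{" + license_template + "}}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_commons_wikitext("photo"))
    print()
    print(build_commons_wikitext("artwork"))
```

The statements from the schema builder (e.g. Depicts / Main subject) are still needed in addition to this text.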
2) Using OpenRefine online
- There is also an online version of OpenRefine. It is somewhat outdated and lacks some newer functionality (e.g. you can’t upload images with it), but it can be helpful when you can’t install OpenRefine on a personal or institutional computer for technical or other reasons. Go to hub-paws.wmcloud.org, log in with your Wikimedia account, then select OpenRefine from the set of available tools.
3) Issues with SPARQL queries, e.g. removing multiple line results for same item, etc.
- You can use the GROUP_CONCAT aggregate function (together with GROUP BY) to concatenate multiple values into a single column, so that the same item is not repeated over multiple lines; see this example: https://w.wiki/6qbP
- If you need more help customizing your queries, you can ask your peers, ask ChatGPT (though do not rely on it too much: it is still not very good with SPARQL, and it takes very careful prompting to get a correct query), or consult trusted sources like Stack Overflow and this very helpful page of SPARQL examples on Wikidata: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples
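To illustrate the GROUP_CONCAT approach, here is a minimal sketch in Python. It builds a query that returns one row per item with all of its materials joined into one column, and it includes a small pure-Python function that mimics what the aggregate does. The properties P195 (collection) and P186 (made from material) are chosen only as an example; the function names are hypothetical, and the collection QID must be supplied by you.

```python
import re
from collections import defaultdict
from string import Template

# One row per item; all English material labels are joined into ?materials
QUERY_TEMPLATE = Template("""
SELECT ?item ?itemLabel
       (GROUP_CONCAT(DISTINCT ?materialLabel; separator=", ") AS ?materials)
WHERE {
  ?item wdt:P195 wd:$collection ;
        rdfs:label ?itemLabel .
  FILTER(LANG(?itemLabel) = "en")
  OPTIONAL {
    ?item wdt:P186 ?material .
    ?material rdfs:label ?materialLabel .
    FILTER(LANG(?materialLabel) = "en")
  }
}
GROUP BY ?item ?itemLabel
""")


def build_query(collection_qid):
    """Fill the collection QID (e.g. the QID of your museum) into the query."""
    if not re.fullmatch(r"Q\d+", collection_qid):
        raise ValueError("collection_qid must look like 'Q12345'")
    return QUERY_TEMPLATE.substitute(collection=collection_qid)


def group_concat(rows, key, value, separator=", "):
    """Local illustration of GROUP BY plus GROUP_CONCAT(DISTINCT ...)."""
    grouped = defaultdict(list)
    for row in rows:
        values = grouped[row[key]]  # creates the group on first sight
        candidate = row.get(value)
        if candidate is not None and candidate not in values:
            values.append(candidate)
    return {group: separator.join(values) for group, values in grouped.items()}


if __name__ == "__main__":
    # Made-up rows showing the duplication problem: item A appears on several lines
    rows = [
        {"item": "A", "material": "wood"},
        {"item": "A", "material": "brass"},
        {"item": "B", "material": "glass"},
        {"item": "A", "material": "wood"},
    ]
    print(group_concat(rows, "item", "material"))
```

Every variable in the SELECT clause that is not aggregated has to appear in GROUP BY, which is why both `?item` and `?itemLabel` are listed there.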
Final updates regarding publications:
For reference, you can have a look at the publications of your peers, or at my own publication, which exemplifies the different parts of the assignment.
- published view here: https://lozanaross.github.io/catalogue-003/
- GitHub code view here: https://github.com/lozanaross/catalogue-003
FINAL SUBMISSION
Send the spreadsheets to the instructor via email.
Add your name & link to your publication below:
- Jana Cornelius / https://janacnl.github.io/catalogue-003/
- Ahmad Hasan Ahmad / https://ahmad19111.github.io/catalogue-003/
- Ahmad Aroud / https://ahmadaroud.github.io/catalogue-003/
- Anna Rahr / https://calnfynn.github.io/catalogue-003/
- Memo Loran Tuku / https://mloran.github.io/catalogue-003/
- Mohammad Darkhabani / https://mohammad19921991.github.io/catalogue-003/
- Lisa Sommer / https://pgxe9zu1.github.io/catalogue-003/