Two days ago, at Codemotion Berlin 2013, we, i.e. Michael Lex and Ben Ripkens, gave a technology lab. We had an ambitious plan: in three hours we wanted to demonstrate typical agile development practices such as test-driven development (TDD), acceptance test-driven development (ATDD), continuous integration (CI), continuous delivery (CD) and more, while having the attendees work on a small demo project.
Wikipedia is not only a never-ending rabbit hole of information. You start with an article on a topic you want to learn about, and hours later you end up on an article that has nothing to do with the topic you originally looked up. And all that time, you have just been clicking your way from one article to another.
But from a different perspective, Wikipedia is probably the biggest crowd-sourced information platform, with a built-in review process and as many languages as its users care to write in (not to mention that, together with Google, it has almost completely ousted printed encyclopaedias). So if this is not Big Data, then what is (pardon my sarcasm)?
The whole code, including step-by-step usage instructions, is available on GitHub: https://github.com/pavlobaron/wpcorpus. Any constructive feedback and help are welcome.
In this post I want to share some experiences in the field of “Machine Learning” that my current project has recently pointed me to. I will focus on “Data Classification” with the tool RapidMiner and give an overview of the topic. In particular, I would like to show how you can use this “stuff” from your Java application.