23 October 2018

This article by Claire Parry first appeared on the CMP Content Services blog on 22 October 2018.

This was my second Taxonomy Boot Camp London (TBCL), attending as a representative of NetIKX, one of the supporters of the event. I had been impressed by the quality of talks at last year’s Boot Camp and the standard this year equalled, if not exceeded, that of the previous event. It was fascinating to see how some of the topics of last year (such as ‘fake news’ and big data) were still current, but the conversation around these issues has moved on and deepened.

Not just two sides to every story

The opening keynote from Paul Rissen expanded on the issues of truth, free speech and fact checking to explore ways in which information environments tackle these problems, with varying success: ‘fact checking’ alone does not work because ‘true v false’ is no longer the key issue. We are faced with the ‘Russian firehose of falsehood’ propaganda model, whereby we are overwhelmed with so many different narratives that nobody knows what to believe or do. The aim is to distract and discourage: the well-meaning maxim of ‘question everything’ has been distorted into ‘trust no-one’. The artificial format that presents only two opposing sides to every story, and aims to give equal weight to both, is still prevalent in broadcast journalism. ‘Free speech’ is all too often understood to mean a lack of regulation, ignoring the fact that it is automatically easier for a powerful group to express its ideas freely. These issues affect how we manage online environments, especially where value tends to be measured by the number of clicks and likes. The agile philosophy values ‘moving fast and breaking things’, but maybe we need to move fast and fix things! Community management needs to value diversity, respect the intelligence of users and reward positive behaviour rather than simply punishing negative behaviour. We should have ‘strong opinions, loosely held’ – there is nothing wrong with changing your mind in the face of new evidence.

Machines learn what we teach them

Machine learning was also a recurring theme, with several speakers addressing the limitations of reliance on algorithms and the ways in which taxonomy can enhance machine learning, an area which is still under-exploited. Machine learning and AI are valuable as they allow us to see patterns in information, but we need to be aware of the limits of machine classification – such as the well-publicised cases of algorithms ‘learning’ from and replicating human social and cultural biases. Machines are useful for pattern recognition and concept extraction, but selecting suitable training sets is not easy. We must remember that machines learn from us and we need to be careful what we teach them!

Taxonomy alone cannot fix enterprise search

The second day’s keynote, by Tom Reamy of KAPS Group, covered many of these issues and questioned some of the received wisdom about the value of taxonomies: it’s a good thing that we no longer need to explain what a taxonomy is, but the downside is that there are a lot of bad ones out there. Taxonomies are often seen as the solution to poor enterprise search, but tagging documents is laborious and users are resistant to being forced to choose tags. While taxonomies add structure to unstructured content, automatic tagging provides limited accuracy: a hybrid model combining taxonomy and text analytics is a better foundation for an enterprise solution.

Information 4.0 and the Semantic Web

Linked data and the Semantic Web, while hardly new topics, seem to have had a resurgence in information management circles in the past few years, possibly related to the need for interoperability not just between systems but also between humans and machines. Information 4.0 and the Internet of Things have provided fresh applications for Semantic Web standards and models, which are now being discussed in a commercial context rather than in purely academic circles. There is increased use of graph databases and a growing interest in extending taxonomies with ontologies and in sharing metadata across systems. Several speakers addressed the issue of standardising and mapping vocabularies across different domains and resources. The use of SKOS as a common data model for sharing and linking knowledge organisation systems was frequently mentioned.
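
For readers who have not met SKOS, the idea of mapping vocabularies can be shown with a small sketch. This is not taken from any talk at the event: the two vocabularies, their example.org URIs and the `translate` helper are all invented for illustration, and the `Concept` class is a heavily simplified stand-in for a skos:Concept rather than a real SKOS implementation. It only shows how a mapping link such as skos:exactMatch lets a term from one scheme be looked up in another.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Simplified stand-in for a skos:Concept (illustrative only)."""
    uri: str
    pref_label: str                                   # skos:prefLabel
    alt_labels: list = field(default_factory=list)    # skos:altLabel
    broader: list = field(default_factory=list)       # skos:broader (URIs)
    exact_match: list = field(default_factory=list)   # skos:exactMatch (URIs)


# Two hypothetical vocabularies, keyed by concept URI.
news_vocab = {
    "http://example.org/news/fake-news": Concept(
        uri="http://example.org/news/fake-news",
        pref_label="Fake news",
        alt_labels=["Hoax news"],
        # Assumed mapping, chosen purely for the example.
        exact_match=["http://example.org/library/disinformation"],
    ),
}

library_vocab = {
    "http://example.org/library/disinformation": Concept(
        uri="http://example.org/library/disinformation",
        pref_label="Disinformation",
        broader=["http://example.org/library/propaganda"],
    ),
    "http://example.org/library/propaganda": Concept(
        uri="http://example.org/library/propaganda",
        pref_label="Propaganda",
    ),
}


def translate(label, source, target):
    """Return preferred labels in `target` mapped from `label` in `source`."""
    wanted = label.casefold()
    results = []
    for concept in source.values():
        # Match on the preferred label or any alternative label.
        labels = [concept.pref_label, *concept.alt_labels]
        if any(wanted == candidate.casefold() for candidate in labels):
            # Follow the mapping links into the target vocabulary.
            for uri in concept.exact_match:
                if uri in target:
                    results.append(target[uri].pref_label)
    return results


print(translate("hoax news", news_vocab, library_vocab))
```

In practice the concepts would be published as RDF and the mapping would be curated by people who know both domains; the sketch only illustrates why a shared model makes that linking possible.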

Why Taxonomy Boot Camp?

Taxonomy Boot Camp offers an excellent opportunity for anyone working with, or interested in, taxonomies, ontologies and related fields to brush up on their knowledge, to network with fellow professionals and to see real-world applications of what can appear a rather esoteric and abstract concept. For those new to the topic, there is also a separate day of intensive workshops before the conference. Well done to the organisers for another stimulating and enjoyable event – and for anyone who didn’t make it this year, the next TBCL is scheduled for 15 & 16 October 2019.
