By: Eric Roche, Chief Data Officer, Office of the City Manager
At its core, open data is about providing data that anyone can access, share, and use. Providing data is the foundation of all open data programs. The thing is, nobody (including KCMO) has really figured out how to deliver on the original promise of open data. A consistent question in the movement is “how can we make the vast troves of government data easier to use?” Cities everywhere are working on this, and we are all slowly making progress. Many cities have beautiful open data homepages; I’m particularly fond of New York’s design. In Kansas City, we were quoted $5,000 for a website that wouldn’t even come close to New York’s level of design, and our office doesn’t have the web staff to maintain a design that complex. So, how can we make our users’ experience better without spending any money, hiring additional staff, or repurposing existing staff time?
One of the areas where I personally deliver value is in being a human data portal. In my head exists a mental map of nearly every quality open data set on OpenData KC. Working closely with these datasets has given me a good idea of their size, update frequency, granularity, and features. I know which datasets have high-quality maps, which fields can be linked to other datasets, and which datasets are most commonly used to solve real-world problems.
When I’m at community meetings, residents usually ask questions in general terms. For example, “do you have data on crime?” Thankfully, we do have data on crime, but simply saying “yes” is not the best customer experience. Instead, a (sometimes lengthy) conversation takes place where I ask a series of follow-up questions. Ultimately, I’m trying to figure out what problem this user is trying to address with data. What is their real question, and what data can I provide to best answer it?
If this resident were to search “crime” on the OpenData KC page, they would get 160 different results. How do they know where to look? Unless they are deeply familiar with city data (unlikely) or enjoy searching through 160 results (also unlikely), they probably give up, or click what is at the top of the list, or google it and hope something better comes up, or call 311, or tweet at the mayor. What isn’t going to happen is them finding the best answer to their actual question, because data portals aren’t made to answer questions.
However, when working with actual people, I can ask things like “Well, what type of crime data are you interested in? Do you want high-level trends like the number of homicides, a list of crimes in your neighborhood, or a map with the latest crimes?” This conversational approach leads to much better outcomes for them because I can get them the dataset that matches their needs. By hearing the options available to them, they can further refine their question. However, there is only one of me; I can only be in one location at a time, and because I sleep, I am unavailable to respond to a 2 a.m. query.
So, with no budget, no additional staff, and no ability to make more hours in the day, how can we try and anticipate users’ questions and guide them to the right data?
We decided that an interesting approach would be to try and build a chatbot that could do this for us. A chatbot for open data hadn’t been built before, but the technology has come a long way in the past few years. Furthermore, there are quite a few platforms out there that would allow us to build one with little effort. We decided to try it as an experiment.
The idea of creating an always-available, instant-response, customer service focused bot that could apply all the things I’ve learned from conversations with residents was intriguing. Please, don’t read too deeply into the fact that I’m trying to automate human interaction.
For something that was meant to be an experiment, we weren’t going to overthink procurement. After researching different platforms, we decided to go with Chatfuel because it was 1) well reviewed and 2) free. I’ve gotten a lot of questions about how I chose a platform to build out on. It wasn’t a hard decision because it didn’t really matter. If it didn’t work well I would have just switched to a different one. It’s not like this bot was going to attract a million users, become sentient, and fight Elon Musk for AI supremacy. It’s a chatbot that guides users to datasets – a relatively simple goal, in my opinion. It’s an experiment, and I just wanted to build it – not dilly-dally in procurement stasis.
So, we built a bot. It was fairly easy. No coding was one of my original rules. Sure, there are some cool coding packages out there that we could use to build a better bot, but why spend 10x the effort on something that you aren’t sure will even work?
Chatfuel works by snapping together little blocks of text. Essentially, it starts with “if they say this,” then “show them this block.” And on and on and on. Chatfuel’s platform is usually smart enough to know that “crime” and “crimes” are synonymous. However, it’s not sophisticated enough to know that if someone types in “murder” it should take them to the crime block. So, we spent some time setting up synonym lists that try to get users to the right chat block based on what we thought people might search. It’s not perfect. Not by a longshot. However, we were able to make “murder” lead to “crime.” The problem is you can never predict what people are going to type, and the technology isn’t advanced enough to infer where to take the user based on what they type. I’m not sweating this. Amazon, Google, and other companies are closer, but they haven’t figured this out yet either. I built the bot as my project while attending weekly Code for Kansas City meetings, and it probably took 30 hours total. I’m pretty happy with the ROI.
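To make the “snapping together blocks” idea more concrete, here is a rough Python sketch of the kind of keyword-and-synonym routing described above. This is purely illustrative: Chatfuel configures these rules through its point-and-click interface rather than code, and the topics, synonym lists, and replies below are hypothetical examples, not the real bot’s.

```python
# Illustrative sketch only: Chatfuel does this through its web UI, not code.
# The topics, synonym lists, and replies below are made-up examples.

SYNONYMS = {
    "crime": {"crime", "crimes", "murder", "homicide", "theft", "assault"},
    "potholes": {"pothole", "potholes", "road repair"},
    "permits": {"permit", "permits", "building permit"},
}

BLOCKS = {
    "crime": ("Do you want citywide trends, a list of crimes in your "
              "neighborhood, or a map of the latest crimes?"),
    "potholes": "Here is the dataset of open pothole repair requests: ...",
    "permits": "Here is the building permits dataset: ...",
}

FALLBACK = "Sorry, I don't know that one yet. Try a topic like 'crime' or 'permits'."

def route(message: str) -> str:
    """Return the chat block whose synonym list matches the user's message."""
    text = message.lower()
    for topic, keywords in SYNONYMS.items():
        if any(keyword in text for keyword in keywords):
            return BLOCKS[topic]
    return FALLBACK

if __name__ == "__main__":
    print(route("Do you have data on murder?"))   # routed to the crime block
    print(route("Where do I find zoning info?"))  # falls through to FALLBACK
```

Even this toy version shows the weak spot mentioned above: anything that isn’t on a synonym list falls straight through to the fallback message.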
At this point we deployed the bot to Facebook by clicking a few buttons. Then I paid $1 for a cute robot logo from thenounproject. We used a robot as our display picture because it’s a bot. Get it? Truly earth-shattering levels of creativity happening here. For the public spending watchdogs, don’t worry, I didn’t bill the City for logo reimbursement because I am an exceedingly generous public servant. Also, the reimbursement form takes too long to fill out.
At this point, we had a working bot. We unveiled it at a KCStat meeting while reporting out on the open data program’s progress. I’m happy with how the bot is being used. Honestly, I expected about seven people to ever use it. Instead, the reception has been very surprising. Other cities are interested in bots for similar purposes, and the bot has received a lot of media attention. Whether the bot has made open data more user-friendly, I’m not sure. We haven’t received any feedback either way.
What has happened is that by providing a bot for people to mess with, we have been able to start a larger conversation about the future of chatbots in city governments. It’s always useful to have something to respond to when presenting a new idea, and the open data chatbot fills that role nicely.
I’ve been receiving a lot of questions about the future of chatbots in government. Some of the challenges with growing this project include: scaling it to other cities, maintaining the bot’s knowledge as datasets change, placing the bot on additional platforms, and making it more conversational. I hope that someone else sees this bot for what it is, an alpha version of what is possible, and creates something even better. So, if you see this bot and think “I can do better”, please do it! Extra points to you if you can make it open-source.
You can chat with our bot at m.me/OpenDataKC