Migrating Grafana’s template variables from AngularJS to React: A tale of failures and wins
As many of you already know, we created Grafana using AngularJS, but we have been migrating to React for about two years now. One of the big missing pieces in our migration puzzle was the templating system. This post starts in late 2019 when I first got my hands on this mysterious and complex area of the Grafana code base.
Challenges with the old system
There was almost a feeling of starting an epic quest à la Lord of the Rings. The mystery was certainly there, and so were the many challenges that more experienced Grafanistas had warned me about.
One of the first things I walked into was the initialization of variables in the old system. I ended up writing this 38-line comment that describes the flow.
State mutated everywhere
In the old templating system, different components, services, and utilities were glued together. When you made a change somewhere in the old templating system, that change mutated all the pieces that were glued together. This is not necessarily a terrible thing as all the connected pieces are constantly updated, but it makes it a challenge to break apart.
The old system was elegantly designed using polymorphism, where several classes conformed to the same interface. Unfortunately, the system was not always used as a polymorphic system, and there were assumptions about the concrete implementations in some places.
Episode I: The one with the proxy approach
The first picture below shows a very simplified overview of what the old variable system looked like. The parent of all variables is a list in the dashboard model. When Grafana loaded a dashboard, that list was passed to a service called VariableSrv. VariableSrv was then responsible for all the heavy lifting around initializing variables so that other Angular services or controllers could use the initialized variables.
Every instance of a variable held its own state, and those were accessible through public properties.
The end goal here was to migrate the templating state handling to Redux and the UI components from AngularJS to React. The old templating system was really large and touched so many areas of Grafana, so the challenge became how to rewrite this in an incremental way.
Introducing the Proxy
The first attempt to migrate the old system was to introduce getters/setters to all public properties for all variable types. By introducing a getter and setter for a property, I could store the variable state in our Redux store with none of the Angular services, controllers, or directives knowing about this. The second picture below shows a simplified overview of what I was aiming for.
The thinking was that when the state was in Redux instead, I could more easily migrate the Angular UI to React.
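To make the proxy idea concrete, here is a minimal sketch. It is not Grafana's actual code: the tiny `createStore` stands in for Redux, and `ProxiedVariable`, the `setCurrent` action, and the state shape are all hypothetical names chosen for illustration.

```typescript
// Hypothetical state shape and action; assumptions for illustration only.
type VariableState = { name: string; current: string };
type State = Record<string, VariableState>;
type Action = { type: 'setCurrent'; name: string; value: string };

function reducer(state: State, action: Action): State {
  if (action.type === 'setCurrent') {
    return { ...state, [action.name]: { ...state[action.name], current: action.value } };
  }
  return state;
}

// A tiny hand-rolled store standing in for Redux.
function createStore(initial: State) {
  let state = initial;
  return {
    getState: (): State => state,
    dispatch: (action: Action): void => {
      state = reducer(state, action);
    },
  };
}
type Store = ReturnType<typeof createStore>;

// The variable keeps its public `current` property, but the getter reads
// from the store and the setter dispatches an action instead of mutating.
class ProxiedVariable {
  private store: Store;
  readonly name: string;

  constructor(store: Store, name: string) {
    this.store = store;
    this.name = name;
  }

  get current(): string {
    return this.store.getState()[this.name].current;
  }

  set current(value: string) {
    this.store.dispatch({ type: 'setCurrent', name: this.name, value });
  }
}

const store = createStore({ server: { name: 'server', current: 'a' } });
const variable = new ProxiedVariable(store, 'server');

// Legacy code assigns to the property exactly as before.
variable.current = 'b';
```

Callers on the Angular side would keep reading and assigning `variable.current` without knowing that the value now lives in the store.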
Why it didn’t make it
The proxy approach felt kind of like a hack, and it was hard to foresee how long it would take to migrate all Angular UI before I could remove this hack(ish) approach. Also, it meant that Grafana’s resulting Redux store might not be as optimal as it could be.
Although this approach didn’t really fail, we decided after some discussion to try a fresh approach.
Episode II: The one with the top-down approach
The second attempt to migrate the old system was to migrate one variable type all the way from the UI down to the state.
I chose the most complex variable type, query, for this work. My thinking was that this variable type would fail the fastest, but if it succeeded, it would cover the hardest and most complex areas. Furthermore, any migrated component could be used even if I failed.
The picture below shows an overview of how that looked.
You can see in the picture above that the query variable type is missing from the old system and is instead part of our Redux store. This was not enough, though, because I needed to design the system so that I could easily migrate other variable types, one after the other.
Introducing the adapter
So I needed to create a system that was flexible enough to migrate variable types one after the other.
I came up with this adapter pattern that I thought would solve this challenge. Grafana would keep a collection of adapters where I could add the existing variable types as I migrated them. Each adapter would contain these key properties:
- Reducer: handles state of the variable type that is not shared among all variable types
- Actions: actions for the variable type that are not shared among all variable types
- Picker: the UI component that is used to pick/change the value of a variable
- Editor: the UI component that is used to edit the definition of a variable
A simplified overview of the adapter is shown in the picture below.
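As a rough sketch of the four properties listed above, an adapter collection could look something like this. All names here (`VariableAdapter`, `getAdapter`, the `query/setCurrent` action) are assumptions for illustration, and the picker and editor are plain functions returning strings as stand-ins for React components.

```typescript
type Action = { type: string; payload?: string };

interface VariableModel {
  type: string;
  name: string;
  current: string;
}

// Hypothetical adapter shape mirroring the four key properties.
interface VariableAdapter {
  reducer: (state: VariableModel, action: Action) => VariableModel;
  actions: Record<string, (payload: string) => Action>;
  picker: (model: VariableModel) => string; // stand-in for a React component
  editor: (model: VariableModel) => string; // stand-in for a React component
}

// The collection of adapters, keyed by variable type.
const adapters = new Map<string, VariableAdapter>();

const queryAdapter: VariableAdapter = {
  reducer: (state, action) =>
    action.type === 'query/setCurrent' && action.payload !== undefined
      ? { ...state, current: action.payload }
      : state,
  actions: {
    setCurrent: (payload) => ({ type: 'query/setCurrent', payload }),
  },
  picker: (model) => `[${model.name}: ${model.current}]`,
  editor: (model) => `Edit ${model.type} variable "${model.name}"`,
};

adapters.set('query', queryAdapter);

function getAdapter(type: string): VariableAdapter {
  const adapter = adapters.get(type);
  if (!adapter) {
    throw new Error(`No adapter registered for variable type "${type}"`);
  }
  return adapter;
}

// Shared code only talks to the adapter, never to a concrete variable type.
const model: VariableModel = { type: 'query', name: 'server', current: 'a' };
const adapter = getAdapter(model.type);
const updated = adapter.reducer(model, adapter.actions.setCurrent('b'));
```

Migrating another variable type would then mean registering one more adapter, while the shared code stays unchanged.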
Why it didn’t make it
The top-down approach didn’t fail as fast as I wanted; it took almost a month before I realized that this approach wouldn’t work.
First, migrating just one variable type meant that Grafana had to notify Angular when anything changed in the new system. The opposite was also true. Whenever something changed in Angular, Grafana needed to notify the new system. I did this using a Redux middleware, but that quickly became very complicated.
Second, I’d overlooked a very important feature in the variable system, namely how dependencies work. I had handled old-system variables that depended on new-system variables and the other way around, but not chains in which an old-system variable depended on a new-system variable that in turn depended on another old-system variable. As soon as I realized this limitation in our approach, I almost gave up on this quest.
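The notification half of that setup can be sketched as a middleware in the Redux style (store API, then `next`, then `action`). This is only an illustration of the technique: the `templating/` action prefix and `notifyAngular` are hypothetical, and the dispatch chain is wired by hand rather than with Redux itself.

```typescript
type Action = { type: string; payload?: unknown };
type Dispatch = (action: Action) => Action;
type Middleware = (api: { getState: () => unknown }) => (next: Dispatch) => Dispatch;

// Hypothetical stand-in for whatever would tell the Angular side about a change.
const angularNotifications: string[] = [];
const notifyAngular = (action: Action): void => {
  angularNotifications.push(action.type);
};

// Forward every action, then notify Angular about templating actions only.
const notifyAngularMiddleware: Middleware = () => (next) => (action) => {
  const result = next(action); // let the rest of the chain run first
  if (action.type.startsWith('templating/')) {
    notifyAngular(action);
  }
  return result;
};

// Hand-wired dispatch chain standing in for a real store.
const handled: Action[] = [];
const baseDispatch: Dispatch = (action) => {
  handled.push(action);
  return action;
};
const dispatch = notifyAngularMiddleware({ getState: () => ({}) })(baseDispatch);

dispatch({ type: 'templating/variableChanged' });
dispatch({ type: 'dashboard/saved' });
```

This covers only one direction. The complexity came from also needing the reverse path, with Angular changes flowing back into the store without triggering an endless loop of notifications.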
The last episode: The one with the feature toggle approach
My last failure left me feeling pretty bad. Fortunately for me, Marcus Andersson had joined this quest some time before that. Thanks to Marcus, I regained my confidence. We discussed our alternatives with Torkel Ödegaard, the founder of Grafana, and decided to make one last attempt to migrate the templating system.
This time around, we decided we would not mix Angular and React/Redux but rather introduce a separate React/Redux system that could be turned on with a feature toggle. This gave us the following benefits:
- We could remove the complex middleware that handled notifications between Angular/Redux.
- We could reuse almost all of our code from Episode II.
- The feature toggle enabled us to merge our work to master even though we weren’t 100% feature complete.
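In its simplest form, a toggle like this selects one complete system or the other at a single point, so the two never have to talk to each other. The sketch below assumes a hypothetical `newVariables` toggle and a hypothetical `TemplatingSystem` interface.

```typescript
// Hypothetical common interface for the two complete implementations.
interface TemplatingSystem {
  readonly name: string;
  init(variables: string[]): string[];
}

const angularTemplating: TemplatingSystem = {
  name: 'angular',
  init: (variables) => variables.map((v) => `angular:${v}`),
};

const reduxTemplating: TemplatingSystem = {
  name: 'redux',
  init: (variables) => variables.map((v) => `redux:${v}`),
};

// The toggle name is an assumption for illustration.
function selectTemplating(toggles: { newVariables?: boolean }): TemplatingSystem {
  return toggles.newVariables ? reduxTemplating : angularTemplating;
}

const defaultSystem = selectTemplating({});
const toggledSystem = selectTemplating({ newVariables: true });
```

Because the old system stays the default, unfinished work in the new one can be merged without affecting anyone who has not turned the toggle on.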
We had some prior discussions about the design goals for this new attempt that I thought would be interesting to share:
- Reuse the adapter pattern from Episode II.
- Only actions could mutate state, a no-brainer. :)
- Make UI components contain as little complexity as possible, i.e., very simple components.
- Complexity belongs in actions and reducers, which makes it easy to test. (Having everything in the state introduced other challenges; more about that below.)
- 1:1 migration of the UI. This limited the scope for this approach.
Our new React/Redux templating system
After less than a month, we had successfully built a new templating system based on React/Redux. Hurray, great success!
The biggest impact of the migration was of course all the great benefits we got from leaving AngularJS and using React/Redux. For instance, React/Redux will naturally introduce an architecture that separates the UI from the state. Previously the state was mutated everywhere, which was a big challenge.
We also moved (most of) the logic from the UI to thunks and state. Moving the logic means that more code is covered with tests, which will make it easier to review future pull requests. This will lead to higher quality code.
Also, having a centralized state in Redux will make it very easy to use the variable system in other places throughout Grafana in the future.
We also discovered some challenges that we could solve – and some that are still unsolved today.
How do you write good mock data?
It became apparent to us when we started writing tests that we needed a common way to create mock variable data. Instead of using some existing framework, Marcus came up with a simple yet elegant builder pattern solution.
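A builder for mock variables might look roughly like the sketch below. This is not Marcus's actual implementation; `VariableBuilder`, its method names, and the model fields are hypothetical and only illustrate the pattern.

```typescript
interface VariableModel {
  type: string;
  name: string;
  current: string;
  options: string[];
}

// Hypothetical builder: sensible defaults, with overrides chained fluently.
class VariableBuilder {
  private model: VariableModel;

  constructor(type: string) {
    this.model = { type, name: 'variable', current: '', options: [] };
  }

  withName(name: string): this {
    this.model = { ...this.model, name };
    return this;
  }

  withOptions(...options: string[]): this {
    this.model = { ...this.model, options };
    return this;
  }

  withCurrent(current: string): this {
    this.model = { ...this.model, current };
    return this;
  }

  // Return a copy so tests cannot accidentally share mutable mock data.
  build(): VariableModel {
    return { ...this.model, options: [...this.model.options] };
  }
}

const queryVariable = new VariableBuilder('query')
  .withName('server')
  .withOptions('a', 'b')
  .withCurrent('a')
  .build();

const defaultVariable = new VariableBuilder('custom').build();
```

Each test then states only the fields it cares about, and the defaults cover the rest.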
How do you write Thunk tests when everything is using Redux state?
The one drawback with having everything in the state is when you try to test thunks. Almost every thunk we had made calls to getState, and our state was dependent on the results of other thunks or actions.
We talked about it and came up with two alternatives:
- We could mock getState.
- We could use “real” reducers and store.
Mocking getState would be the best solution for the simplest thunks, but with most of our thunks it would introduce some complexity. We have thunks that make several getState calls, which means that our mock would have to mimic that.
One could argue that we should split up those thunks into smaller thunks with only one getState call, and that is still something that is worth considering.
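The problem with mocking getState shows up even in a small, made-up thunk. In this sketch, `changeAndReport` is a hypothetical thunk that reads state, dispatches, and reads state again, so the mock has to return the right state for each call, in the right order.

```typescript
type State = { current: string };
type Action = { type: 'setCurrent'; payload: string };
type Thunk = (dispatch: (action: Action) => void, getState: () => State) => string;

// Hypothetical thunk with two getState calls around a dispatch.
const changeAndReport = (value: string): Thunk => (dispatch, getState) => {
  const before = getState().current;
  dispatch({ type: 'setCurrent', payload: value });
  const after = getState().current;
  return `${before} -> ${after}`;
};

// The mock must know how many times the thunk calls getState and what each
// call should see: the state before the dispatch, then the state after it.
const states: State[] = [{ current: 'a' }, { current: 'b' }];
let call = 0;
const mockGetState = (): State => states[call++];

const dispatched: Action[] = [];
const report = changeAndReport('b')((action) => dispatched.push(action), mockGetState);
```

The test now encodes the internal call order of the thunk, so reordering or adding a getState call breaks the mock even when the behavior is unchanged.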
We ended up creating a test utility we called reduxTester that gave us a way to use our “real” store and reducers when testing our thunks. You can say reduxTester became an integration test framework for our Redux store.
This way we didn’t need to mock state for a particular thunk, but we needed to call the corresponding actions in the correct order to mock state instead.
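The general idea can be sketched as follows. This is not the real reduxTester API: `StoreTester`, its method names, and the example thunk are hypothetical, and a small hand-rolled store with thunk support stands in for Redux.

```typescript
type State = { current: string; options: string[] };
type Action =
  | { type: 'setOptions'; payload: string[] }
  | { type: 'setCurrent'; payload: string };
type Thunk = (dispatch: Dispatch, getState: () => State) => void;
type Dispatch = (action: Action | Thunk) => void;

// The "real" reducer, used as-is in the test.
function reducer(state: State, action: Action): State {
  switch (action.type) {
    case 'setOptions':
      return { ...state, options: action.payload };
    case 'setCurrent':
      return { ...state, current: action.payload };
    default:
      return state;
  }
}

// Hypothetical thunk under test: selects the first available option.
const selectFirstOption = (): Thunk => (dispatch, getState) => {
  const [first] = getState().options;
  if (first !== undefined) {
    dispatch({ type: 'setCurrent', payload: first });
  }
};

class StoreTester {
  private state: State;
  private recorded: Action[] = [];

  constructor(initial: State) {
    this.state = initial;
  }

  private getState = (): State => this.state;

  // Runs thunks, records plain actions, and applies them with the real reducer.
  private dispatch: Dispatch = (action) => {
    if (typeof action === 'function') {
      action(this.dispatch, this.getState);
      return;
    }
    this.recorded.push(action);
    this.state = reducer(this.state, action);
  };

  // Build up state by dispatching real actions in order, then clear the log.
  givenActions(...actions: Action[]): this {
    actions.forEach((action) => this.dispatch(action));
    this.recorded = [];
    return this;
  }

  whenThunkDispatched(thunk: Thunk): this {
    this.dispatch(thunk);
    return this;
  }

  get actions(): Action[] {
    return this.recorded;
  }

  get finalState(): State {
    return this.state;
  }
}

const tester = new StoreTester({ current: '', options: [] })
  .givenActions({ type: 'setOptions', payload: ['a', 'b'] })
  .whenThunkDispatched(selectFirstOption());
```

A test can then assert on `tester.actions` and `tester.finalState`. The state the thunk sees comes from dispatching real actions in the correct order, not from a hand-written mock.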
There’s probably a better way to solve this issue, but this is the way we solved it.
How should plugins access state and state changes?
One challenge that we still haven’t looked into is how plugins in the Grafana ecosystem should communicate with the new variable system.
This is not a new challenge, and there hasn’t been a clear documented way of doing this in the old variable system either, so this challenge remains to be prioritized. There are a couple of discussions about it on the community site.
Specifically, the service TemplateSrv that is used a lot in the Grafana plugin ecosystem is still a mix between the old variable system design and the new.
While we’ve reached one goal on this quest – we successfully built a new templating system based on React/Redux – there are others that are within reach.
When this post goes public, all that remains of the old variable system is TemplateSrv. So we need to figure out and document the way plugins in the Grafana ecosystem should communicate with the variable system.
Along the way, we’ve learned some lessons too.
It’s very hard to migrate parts of a complex system from one tech stack to another.
It can be a good idea to hide new functionality behind a feature toggle.
And last but not least, whenever you start any epic journey, make sure you have others join the fellowship. It makes it a lot easier and more fun.