Transaction log tailing on CockroachDB
Looking at the previous post you may notice that I made an important edit; the reason is that, at the time, I wasn't aware of CockroachDB Change Data Capture (CDC). This feature really makes things a lot easier: the transaction log miner from the Transaction log tailing pattern literally becomes the CDC in CockroachDB. In a very simple and practical way, you can redirect the change logs of a table into a sink (Kafka, Google Cloud Pub/Sub, etc…), although that built-in integration is only available in the enterprise tier. Nevertheless, in the free core tier we can still build the integration ourselves. We also don't have to change our previous connection setup, because CockroachDB is wire-compatible with PostgreSQL, meaning I can use a PostgreSQL driver to connect to it. So let me describe the steps and put this into action.
Requirements
The intention is to try all of this locally first, on the core tier, which is free forever. To give a full demonstration I'll use Google Cloud Pub/Sub as the sink, waiting for messages on a topic called…wait for it…cats ;). So we will need:
- Cockroach; you can find the installation details here ==> Install CockroachDB on Linux. Instructions for other operating systems are there as well.
- [Optional] A Google Cloud account. I consider this optional because you can just print the messages at the start and decide later which broker to use; a minimal sketch of such a print-only publisher follows this list.
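To make that "print first, pick a broker later" option concrete, here is a minimal sketch of what the publishing abstraction could look like. The msgrelay.PubSub name comes from the code used later in this post, but the method shapes and the StdoutPubSub type are my own guesses for illustration; the real package lives in the repository linked at the end.
file: msgrelay/pubsub.go (hypothetical path)
package msgrelay

import (
	"context"
	"log"
)

// PubSub is a guess at the shape of the publishing abstraction used by the
// repository snippet further down; the real definition lives in the repo.
type PubSub interface {
	// Topic resolves a topic name into whatever handle Publish expects.
	Topic(name string) string
	// Publish sends a message to the given topic.
	Publish(ctx context.Context, topic string, msg []byte)
}

// StdoutPubSub is a print-only implementation, handy while you decide
// which broker (Kafka, Google Cloud Pub/Sub, …) to plug in later.
type StdoutPubSub struct{}

func (StdoutPubSub) Topic(name string) string { return name }

func (StdoutPubSub) Publish(_ context.Context, topic string, msg []byte) {
	log.Printf("topic=%s payload=%s", topic, msg)
}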
Spinning up CockroachDB
Fairly simple
cockroach start-single-node --insecure
Reading the logs that this command generates, you will notice the following addresses, which are important:
- webui, a local web server where you can interact with Cockroach through the browser.
- sql, the standard DSN that you will use for your connection to the database.
[Optional] Connecting to cockroach from the terminal
In case you want to explore it and type some commands, you can connect to it very simply:
cockroach sql --url="{DSN}" --format=csv
In this last command, DSN is the URI used to connect to CockroachDB (the sql address printed when the node started).
Now, inside this console, we need to enable rangefeeds:
SET CLUSTER SETTING kv.rangefeed.enabled = true;
Code in golang
Given that CockroachDB is compatible with PostgreSQL, I can still use gorm for the database connection. Here is a snippet of the function that uses CockroachDB CDC to obtain a log entry for every change on the events table. Of course, this call will hang waiting for new logs.
file: repositories/event/repository.go
package event

import (
	"context"
	// The msgrelay import is elided here; it provides the PubSub abstraction
	// and lives alongside this package in the repository linked at the end.
)

// ChangeFeed fetches logs from the events changefeed and publishes every
// change to the "cats" topic. It blocks while waiting for new rows.
func (r *repository) ChangeFeed(ctx context.Context, pubsub msgrelay.PubSub) error {
	// EXPERIMENTAL CHANGEFEED FOR streams one row per change, with three
	// columns: the table name, the primary key and the row value.
	rows, err := r.db.Raw("EXPERIMENTAL CHANGEFEED FOR events").Rows()
	if err != nil {
		return err
	}
	defer rows.Close()

	topic := pubsub.Topic("cats")

	var (
		table string
		key   string
		value []byte
	)
	for rows.Next() {
		if err := rows.Scan(&table, &key, &value); err != nil {
			return err
		}
		pubsub.Publish(ctx, topic, value)
	}
	// Surface whatever error ended the iteration, if any.
	return rows.Err()
}
The query that allows you to do this in the core tier is EXPERIMENTAL CHANGEFEED FOR <table_name>. In the enterprise tier it becomes ridiculously simple: you don't have to handle the integration with the broker in code at all, since the changefeed writes straight into the sink. It follows a format like this:
CREATE CHANGEFEED FOR TABLE table_name, table_name2 INTO '{scheme}://{host}:{port}?{query_parameters}';
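Back to the core tier: here is a rough sketch of how everything could be wired together, connecting gorm to CockroachDB through the regular PostgreSQL driver and feeding the changefeed into a publisher. The DSN below is the typical one for an insecure single-node setup, event.NewRepository is a hypothetical constructor standing in for however the repository is actually built, and StdoutPubSub is the print-only publisher sketched in the requirements section; check the linked repository for the real wiring.
file: cmd/relay/main.go (hypothetical path)
package main

import (
	"context"
	"log"

	"gorm.io/driver/postgres"
	"gorm.io/gorm"

	// These two imports are illustrative; adjust the paths to your module.
	// "<your module path>/msgrelay"
	// "<your module path>/repositories/event"
)

func main() {
	// Typical DSN for `cockroach start-single-node --insecure`; use the sql
	// address printed in your own startup logs.
	dsn := "postgresql://root@localhost:26257/defaultdb?sslmode=disable"

	// CockroachDB speaks the PostgreSQL wire protocol, so the usual gorm
	// Postgres driver works unchanged (this assumes gorm v2).
	db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if err != nil {
		log.Fatal(err)
	}

	// event.NewRepository is hypothetical; build the repository however the
	// real code does it, as long as it wraps the *gorm.DB above.
	repo := event.NewRepository(db)

	// Start with the print-only publisher; swap in the Google Cloud Pub/Sub
	// implementation once the account and the cats topic are ready.
	if err := repo.ChangeFeed(context.Background(), msgrelay.StdoutPubSub{}); err != nil {
		log.Fatal(err)
	}
}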
Conclusion
With all this said, I think you can hack around yourself and find a way to implement this pattern with the information above. I will leave some links below so you can check them, because you will need them. This was a rather short exposition, not a fully detailed one.
Of course you can check the complete code in this repository.
NOTE: The code could be better.
Bibliography
- Transaction Log Tailing, the most important one, to first understand the idea of the pattern.
- CockroachDB CDC.
- EXPERIMENTAL CHANGEFEED FOR command.
- Install CockroachDB on Linux.
- Changefeed to Google Cloud Pub/Sub.