Remove field/tag column from InfluxDB
At the time of writing, InfluxDB offers no out-of-the-box way to drop a field from a bucket, even though this is a popular request. Fortunately, there is a fairly easy workaround in InfluxDB 2.
Assume you accidentally included a field/tag (or duplicated measurements) and want to clean it up. If you want to delete a whole measurement instead, you can use the built-in delete API with a predicate.
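As a rough sketch of the predicate-based delete mentioned above, the snippet below assembles an `influx delete` call for a single measurement. The measurement name `unwanted_measurement` is a hypothetical placeholder, and the command is only printed for review rather than executed, since deletes cannot be undone.

```shell
#!/bin/sh
# Placeholders: adjust to your setup.
BUCKET="OriginBucket"
MEASUREMENT="unwanted_measurement"   # hypothetical measurement name

# Predicate selecting every point that belongs to that measurement.
PREDICATE="_measurement=\"$MEASUREMENT\""

# Delete over the full time range; the stop time is "now" in UTC (RFC 3339).
STOP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# Print the command instead of running it, so it can be checked first.
echo "influx delete --bucket $BUCKET --start 1970-01-01T00:00:00Z --stop $STOP --predicate '$PREDICATE'"
```

Once the printed command looks right, it can be pasted into a shell with a configured influx CLI (add `--org` and a token if no default configuration is active).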
Set-up
Given an OriginBucket you want to filter data from, create a temporary bucket, BackupBucket. The data will be filtered and stored in the backup bucket; the origin will then be wiped and restored from the backup.
If the bucket holds a lot of data (and/or the server is slow), this operation might take a while, and any data arriving while it is in progress will not be taken into account.
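The temporary bucket can be created in the web UI or, as a minimal sketch, with the influx CLI. The organization name `myorg` is the one used in the task templates further down; adjust both names as needed. The snippet only prints the command for review.

```shell
#!/bin/sh
# Placeholders: adjust the bucket and organization names to your setup.
BACKUP_BUCKET="BackupBucket"
ORG="myorg"

# Assemble the command and print it instead of running it directly.
CREATE_CMD="influx bucket create --name $BACKUP_BUCKET --org $ORG"
echo "$CREATE_CMD"
```

Whichever way the bucket is created, its retention period should cover the full time range being copied; otherwise older points would expire from the backup before they are restored.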
Solution
Create a new task to copy all the data you want to keep from the origin bucket to the backup bucket. Make sure to adjust the parameters (origin and destination bucket, organization, and time range). After the range statement, perform any filtering you want, e.g.
|> drop(columns: ["host"])
to drop the host column from all measurements.
{
  "meta": {
    "version": "1",
    "type": "task",
    "name": "Create filtered bucket backup-Template",
    "description": "template created from task: Create filtered bucket backup"
  },
  "content": {
    "data": {
      "type": "task",
      "attributes": {
        "status": "inactive",
        "name": "Create filtered bucket backup",
        "flux": "option task = {name: \"Create filtered bucket backup\", every: 30d}\n\nfrom(bucket: \"OriginBucket\")\n\t|> range(start: -5y)\n\t|> to(bucket: \"BackupBucket\", org: \"myorg\")",
        "every": "30d"
      },
      "relationships": {
        "label": {
          "data": []
        }
      }
    },
    "included": []
  },
  "labels": []
}
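The template above copies everything unchanged; the filter still has to be added to its Flux script. As a sketch of what the task's Flux could look like with the `drop` step in place, the snippet below writes it to a file (the file name `filter-task.flux` is an arbitrary choice, and `unwanted_field` in the comment is a hypothetical name).

```shell
#!/bin/sh
# Write the filter task as a plain Flux script.
cat > filter-task.flux <<'EOF'
option task = {name: "Create filtered bucket backup", every: 30d}

from(bucket: "OriginBucket")
    |> range(start: -5y)
    // Filtering goes between range() and to(); here the "host" column is dropped.
    // To remove a field rather than a tag, a filter such as
    //   |> filter(fn: (r) => r._field != "unwanted_field")
    // could be used instead.
    |> drop(columns: ["host"])
    |> to(bucket: "BackupBucket", org: "myorg")
EOF

# Show the script for review. With a configured influx CLI it could then be
# registered with something like: influx task create --file filter-task.flux
cat filter-task.flux
```

One thing to keep in mind: if two series differed only in the dropped tag, their points may collide when written to the backup bucket, so it is worth checking the result before wiping the origin.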
Start the backup task manually (e.g. through the influx web UI).
Check all desired data is present in the backup bucket (e.g. through the influx web UI, data explorer).
Remove the data from the original bucket (there is no way back after this) through the influx CLI:
influx delete --bucket OriginBucket --start '1970-01-01T00:00:00Z' --stop "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" -t [OriginBucket token] --org [MyOrg]
(substitute the token and org, or leave them out to use the default configuration).
Add the restore task. Make sure to adjust the parameters (origin and destination bucket, organization, and time range).
{
  "meta": {
    "version": "1",
    "type": "task",
    "name": "Restore bucket backup-Template",
    "description": "template created from task: Restore bucket backup"
  },
  "content": {
    "data": {
      "type": "task",
      "attributes": {
        "status": "inactive",
        "name": "Restore bucket backup",
        "flux": "option task = {name: \"Restore bucket backup\", every: 30d}\n\nfrom(bucket: \"BackupBucket\")\n\t|> range(start: -5y)\n\t|> to(bucket: \"OriginBucket\", org: \"myorg\")",
        "every": "30d"
      },
      "relationships": {
        "label": {
          "data": []
        }
      }
    },
    "included": []
  },
  "labels": []
}
Run the restore task manually (e.g. through the influx web UI).
That’s it. Afterwards both tasks, and the intermediate bucket, can be removed.
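The cleanup can also be done from the influx CLI. The sketch below only prints the commands; `<task-id>` is a placeholder for the ID of each task, which can be looked up with `influx task list`, and `myorg` is the organization name used in the templates.

```shell
#!/bin/sh
# Placeholders: adjust to your setup.
BACKUP_BUCKET="BackupBucket"
ORG="myorg"
TASK_ID="<task-id>"   # placeholder; repeat the task delete once per task

# Assemble the cleanup commands and print them for review.
DELETE_TASK_CMD="influx task delete --id $TASK_ID"
DELETE_BUCKET_CMD="influx bucket delete --name $BACKUP_BUCKET --org $ORG"
echo "$DELETE_TASK_CMD"
echo "$DELETE_BUCKET_CMD"
```

It is probably wise to delete the backup bucket only after confirming that the restored data in the origin bucket looks complete.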
- Permalink: //oostens.me/posts/remove-field/tag-column-from-influxdb/
- License: The text and content is licensed under CC BY-NC-SA 4.0. All source code I wrote on this page is licensed under The Unlicense; do as you please, I'm not liable nor provide warranty.