In this post I am going to quickly cover what is needed to get Curator up and running on the ELK stack. In the last few posts about the ELK stack I covered everything needed to get it installed, configured and ingesting logs reliably. If you missed those posts, you can find them here:
ELK 5 on Ubuntu 16.04
Once the ELK stack has been running for a while, you will likely notice disk space disappearing quickly as you store more data for longer periods. If you’re anything like me, the last thing you need is another manual task to remember, like logging in and clearing out old data, and you also want the cleanup to be consistent. It turns out Elastic already has this covered with Curator. With Curator you define an actions file that tells it which indices to clean up and how many days’ worth of data to retain. For example, for the Winlogbeat indices I can tell Curator to delete any index older than 60 days and then schedule Curator to run as a cron job once a week. Curator will then maintain roughly 60 days’ worth of logs on the instance.
Installing Curator
There are a few different ways to install Curator but the Python pip way generally seems to be the easiest, so that is what I am going to cover here on Ubuntu 16.04.
You can find more info on Curator’s official page here:
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html
1.) Make sure you have Python pip installed:
$ sudo apt-get install python-pip
2.) Install Curator:
$ sudo pip install elasticsearch-curator
3.) Create a new directory to store the configuration files in and let’s also create the Curator Config file:
$ mkdir Curator
$ cd Curator
$ nano Curator-Config.yml
Click here to download both configuration files (Curator.zip – 2KB) or you can just copy and paste the following:
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
4.) Now we need to create an actions file. This is where you define what data to delete and after how long:
$ nano Actions-File.yml
Click here to download both configuration files (Curator.zip – 2KB) or you can just copy and paste the following:
---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True. If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 60 days (based on index name), for winlogbeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: winlogbeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 28
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 60 days (based on index name), for filebeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: filebeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 28
      exclude:
  3:
    action: delete_indices
    description: >-
      Delete indices older than 60 days (based on index name), for packetbeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: packetbeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 28
      exclude:
  4:
    action: delete_indices
    description: >-
      Delete indices older than 60 days (based on index name), for metricbeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: metricbeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 28
      exclude:
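The four actions in this file differ only in the index prefix, so the file could also be generated instead of edited by hand. Here is a minimal sketch of that idea, assuming the same option and filter values as the file above. The helper name build_actions_file and its parameters are made up for illustration and are not part of Curator:

```python
# Hypothetical helper (not part of Curator): renders an actions file with one
# delete_indices action per index prefix, mirroring the layout shown above.

ACTION_TEMPLATE = """\
  {number}:
    action: delete_indices
    description: >-
      Delete indices older than {days} days (based on index name), for {prefix}
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: {prefix}
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '{timestring}'
      unit: days
      unit_count: {days}
      exclude:
"""


def build_actions_file(prefixes, days=28, timestring="%Y.%m.%d"):
    """Return the text of an actions file with one action per prefix."""
    parts = ["---\nactions:\n"]
    for number, prefix in enumerate(prefixes, start=1):
        parts.append(ACTION_TEMPLATE.format(
            number=number, prefix=prefix, days=days, timestring=timestring))
    return "".join(parts)


if __name__ == "__main__":
    beats = ["winlogbeat-", "filebeat-", "packetbeat-", "metricbeat-"]
    print(build_actions_file(beats))
```

Redirecting the output into Actions-File.yml would give a file equivalent to the one above, with the description text and unit_count always in agreement.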
In this configuration, I have an entry for each Beats agent (Winlogbeat, Filebeat, Packetbeat, Metricbeat) since the data from each one is stored in its own index. You will also want to pay attention to the timestring! In this case it is set to %Y.%m.%d, but it needs to match the date pattern used in your index names. If you followed my ELK Stack – Tips, Tricks and Troubleshooting post, where I made some changes to limit the number of shards created, the timestring pattern you will want to use is:
timestring: '%Y.%W'
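The timestring is what Curator uses to work out how old each index is, based on the date in its name, in order to decide whether to delete it. Lastly, unit_count is the number of days’ worth of data to retain; I have set it to 28 days here. Note that the description fields in the actions file still say 60 days. The description is only a free-text label, and unit_count is what actually controls retention.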
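Curator’s timestring uses Python strftime-style directives, so a quick way to see what a pattern produces is to format a date with it. The following is a small sketch, not anything Curator ships. The function name index_name is made up for illustration, and the date is just one taken from the index names in this post:

```python
from datetime import date, datetime


def index_name(prefix, timestring, day):
    """Build the index name a given prefix and timestring would produce."""
    return prefix + day.strftime(timestring)


day = date(2017, 11, 7)

# Daily pattern: year.month.day
print(index_name("winlogbeat-", "%Y.%m.%d", day))  # winlogbeat-2017.11.07

# Weekly pattern: year.week-of-year (weeks start on Monday)
print(index_name("winlogbeat-", "%Y.%W", day))     # winlogbeat-2017.45

# Going the other way: recover the date from a daily index name
parsed = datetime.strptime("2017.11.07", "%Y.%m.%d").date()
print(parsed)                                      # 2017-11-07
```

If the name printed for your pattern does not look like your real index names, the age filter has nothing to parse, which is why the timestring has to match.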
If you are not sure what pattern the indices are using, you can easily check by taking a look at the shards:
rob@LinELK01:~$ curl -XGET 'localhost:9200/_cat/shards?pretty'
winlogbeat-2017.05.05 3 p STARTED 69 290.4kb 127.0.0.1 AjnlVN6
winlogbeat-2017.05.05 2 p STARTED 69 256kb 127.0.0.1 AjnlVN6
winlogbeat-2017.05.05 4 p STARTED 66 210.4kb 127.0.0.1 AjnlVN6
winlogbeat-2017.05.05 1 p STARTED 83 258.4kb 127.0.0.1 AjnlVN6
winlogbeat-2017.05.05 0 p STARTED 51 151.5kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 3 p STARTED 63 240.3kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 1 p STARTED 64 186.7kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 4 p STARTED 80 221.7kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 2 p STARTED 73 211.4kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 0 p STARTED 67 199.9kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 3 p STARTED 757 239.6kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 2 p STARTED 762 233kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 1 p STARTED 765 249.6kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 4 p STARTED 779 205.1kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 0 p STARTED 777 200kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 3 p STARTED 26 20.3kb 127.0.0.1 AjnlVN6
Here we can see that the indices are following the %Y.%m.%d format and the way the actions file is configured should match on these indices just fine.
More info on the date patterns can be found here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html#built-in-date-formats
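Because the shard listing prints one row per shard, the same index name repeats several times. Here is a rough sketch for collapsing that output down to the unique index names, assuming (as in the output above) that the index name is the first whitespace-separated column. The function name unique_indices is made up for illustration, and the sample rows are copied from the listing above:

```python
def unique_indices(cat_shards_output):
    """Return the sorted, de-duplicated index names from _cat/shards text."""
    names = set()
    for line in cat_shards_output.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            names.add(fields[0])
    return sorted(names)


sample = """\
winlogbeat-2017.05.05 3 p STARTED 69 290.4kb 127.0.0.1 AjnlVN6
winlogbeat-2017.05.05 2 p STARTED 69 256kb 127.0.0.1 AjnlVN6
winlogbeat-2017.04.15 3 p STARTED 63 240.3kb 127.0.0.1 AjnlVN6
filebeat-2017.04.13 3 p STARTED 757 239.6kb 127.0.0.1 AjnlVN6
"""

for name in unique_indices(sample):
    print(name)
```

With the names de-duplicated, it is much easier to spot at a glance whether they follow a daily or a weekly pattern.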
5.) Time to put it all together and perform a “dry run”. The following command will run Curator and output which indices it would delete without actually deleting anything:
$ sudo curator --config /home/rob/Curator/Curator-Config.yml --dry-run /home/rob/Curator/Actions-File.yml
You should see something similar to the following:
rob@LinELK01:~$ sudo curator --config /home/rob/Curator/Curator-Config.yml --dry-run /home/rob/Curator/Actions-File.yml
[sudo] password for rob:
2017-11-12 20:04:26,367 INFO Preparing Action ID: 1, "delete_indices"
2017-11-12 20:04:26,374 INFO Trying Action ID: 1, "delete_indices": Delete indices older than 60 days (based on index name), for winlogbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:04:26,669 INFO DRY-RUN MODE. No changes will be made.
2017-11-12 20:04:26,669 INFO (CLOSED) indices may be shown that may not be acted on by action "delete_indices".
2017-11-12 20:04:26,669 INFO DRY-RUN: delete_indices: winlogbeat-2017.02.22 with arguments: {}
2017-11-12 20:04:26,669 INFO DRY-RUN: delete_indices: winlogbeat-2017.03.17 with arguments: {}
2017-11-12 20:04:26,669 INFO DRY-RUN: delete_indices: winlogbeat-2017.03.31 with arguments: {}
2017-11-12 20:04:26,670 INFO DRY-RUN: delete_indices: winlogbeat-2017.04.15 with arguments: {}
2017-11-12 20:04:26,670 INFO DRY-RUN: delete_indices: winlogbeat-2017.05.05 with arguments: {}
2017-11-12 20:04:26,670 INFO DRY-RUN: delete_indices: winlogbeat-2017.05.10 with arguments: {}
2017-11-12 20:04:26,670 INFO DRY-RUN: delete_indices: winlogbeat-2017.05.11 with arguments: {}
2017-11-12 20:04:26,671 INFO Action ID: 1, "delete_indices" completed.
2017-11-12 20:04:26,671 INFO Preparing Action ID: 2, "delete_indices"
2017-11-12 20:04:26,677 INFO Trying Action ID: 2, "delete_indices": Delete indices older than 60 days (based on index name), for filebeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:04:26,763 INFO DRY-RUN MODE. No changes will be made.
2017-11-12 20:04:26,763 INFO (CLOSED) indices may be shown that may not be acted on by action "delete_indices".
2017-11-12 20:04:26,763 INFO DRY-RUN: delete_indices: filebeat-2017.03.31 with arguments: {}
2017-11-12 20:04:26,763 INFO DRY-RUN: delete_indices: filebeat-2017.04.01 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.04.09 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.04.13 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.04.14 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.05.05 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.05.09 with arguments: {}
2017-11-12 20:04:26,764 INFO DRY-RUN: delete_indices: filebeat-2017.05.11 with arguments: {}
2017-11-12 20:04:26,765 INFO Action ID: 2, "delete_indices" completed.
2017-11-12 20:04:26,765 INFO Preparing Action ID: 3, "delete_indices"
2017-11-12 20:04:26,769 INFO Trying Action ID: 3, "delete_indices": Delete indices older than 60 days (based on index name), for packetbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:04:26,863 INFO Skipping action "delete_indices" due to empty list:
2017-11-12 20:04:26,863 INFO Action ID: 3, "delete_indices" completed.
2017-11-12 20:04:26,863 INFO Preparing Action ID: 4, "delete_indices"
2017-11-12 20:04:26,867 INFO Trying Action ID: 4, "delete_indices": Delete indices older than 60 days (based on index name), for metricbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:04:26,999 INFO Skipping action "delete_indices" due to empty list:
2017-11-12 20:04:26,999 INFO Action ID: 4, "delete_indices" completed.
2017-11-12 20:04:26,999 INFO Job completed.
We can see there are a few indices that are old enough to be deleted, such as the following index:
2017-11-12 20:04:26,670 INFO DRY-RUN: delete_indices: winlogbeat-2017.05.11 with arguments: {}
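To make it clearer why an index like this one gets selected, here is a rough sketch of the prefix-plus-age logic the actions file describes. This is only an approximation for illustration and not Curator’s actual implementation. The function name select_for_deletion is made up, and the reference time is fixed to the time of the dry run above so the result is repeatable:

```python
from datetime import datetime, timedelta


def select_for_deletion(indices, prefix, timestring, unit_count, now):
    """Return indices with the prefix whose name-date is older than the cutoff."""
    cutoff = now - timedelta(days=unit_count)
    selected = []
    for name in indices:
        if not name.startswith(prefix):
            continue  # pattern filter: wrong prefix
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # name does not match the timestring, so no age is known
        if stamp < cutoff:
            selected.append(name)
    return selected


now = datetime(2017, 11, 12, 20, 4, 26)  # time of the dry run above
indices = [
    "winlogbeat-2017.05.11",  # about six months old
    "winlogbeat-2017.11.07",  # five days old
    "winlogbeat-2017.45",     # weekly name, does not fit '%Y.%m.%d'
    "filebeat-2017.04.13",    # old, but handled by a different action
]

print(select_for_deletion(indices, "winlogbeat-", "%Y.%m.%d", 28, now))
# ['winlogbeat-2017.05.11']
```

In this sketch the weekly-named index is skipped because its name cannot be parsed with the daily timestring, which is the practical reason the timestring has to match your index names.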
6.) Now that we know the configuration is good, this time we’ll run the same command but without --dry-run:
$ sudo curator --config /home/rob/Curator/Curator-Config.yml /home/rob/Curator/Actions-File.yml
rob@LinELK01:~$ sudo curator --config /home/rob/Curator/Curator-Config.yml /home/rob/Curator/Actions-File.yml
2017-11-12 20:12:12,315 INFO Preparing Action ID: 1, "delete_indices"
2017-11-12 20:12:12,327 INFO Trying Action ID: 1, "delete_indices": Delete indices older than 60 days (based on index name), for winlogbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:12:12,536 INFO Deleting selected indices: [u'winlogbeat-2017.02.22', u'winlogbeat-2017.03.31', u'winlogbeat-2017.05.05', u'winlogbeat-2017.03.17', u'winlogbeat-2017.04.15', u'winlogbeat-2017.05.10', u'winlogbeat-2017.05.11']
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.02.22
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.03.31
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.05.05
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.03.17
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.04.15
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.05.10
2017-11-12 20:12:12,536 INFO ---deleting index winlogbeat-2017.05.11
2017-11-12 20:12:13,046 INFO Action ID: 1, "delete_indices" completed.
2017-11-12 20:12:13,047 INFO Preparing Action ID: 2, "delete_indices"
2017-11-12 20:12:13,051 INFO Trying Action ID: 2, "delete_indices": Delete indices older than 60 days (based on index name), for filebeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:12:13,123 INFO Deleting selected indices: [u'filebeat-2017.04.13', u'filebeat-2017.04.01', u'filebeat-2017.05.05', u'filebeat-2017.04.09', u'filebeat-2017.05.11', u'filebeat-2017.03.31', u'filebeat-2017.04.14', u'filebeat-2017.05.09']
2017-11-12 20:12:13,123 INFO ---deleting index filebeat-2017.04.13
2017-11-12 20:12:13,123 INFO ---deleting index filebeat-2017.04.01
2017-11-12 20:12:13,123 INFO ---deleting index filebeat-2017.05.05
2017-11-12 20:12:13,123 INFO ---deleting index filebeat-2017.04.09
2017-11-12 20:12:13,124 INFO ---deleting index filebeat-2017.05.11
2017-11-12 20:12:13,124 INFO ---deleting index filebeat-2017.03.31
2017-11-12 20:12:13,124 INFO ---deleting index filebeat-2017.04.14
2017-11-12 20:12:13,124 INFO ---deleting index filebeat-2017.05.09
2017-11-12 20:12:13,607 INFO Action ID: 2, "delete_indices" completed.
2017-11-12 20:12:13,607 INFO Preparing Action ID: 3, "delete_indices"
2017-11-12 20:12:13,611 INFO Trying Action ID: 3, "delete_indices": Delete indices older than 60 days (based on index name), for packetbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:12:13,655 INFO Skipping action "delete_indices" due to empty list:
2017-11-12 20:12:13,655 INFO Action ID: 3, "delete_indices" completed.
2017-11-12 20:12:13,655 INFO Preparing Action ID: 4, "delete_indices"
2017-11-12 20:12:13,660 INFO Trying Action ID: 4, "delete_indices": Delete indices older than 60 days (based on index name), for metricbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2017-11-12 20:12:13,712 INFO Skipping action "delete_indices" due to empty list:
2017-11-12 20:12:13,713 INFO Action ID: 4, "delete_indices" completed.
2017-11-12 20:12:13,713 INFO Job completed.
7.) Now recheck the shards and verify that there are not any indices older than what is defined in the Actions-File (28 days):
rob@LinELK01:~$ curl -XGET 'localhost:9200/_cat/shards?pretty'
filebeat-2017.11.07 3 p STARTED 463 169kb 127.0.0.1 AjnlVN6
filebeat-2017.11.07 4 p STARTED 436 128kb 127.0.0.1 AjnlVN6
filebeat-2017.11.07 1 p STARTED 447 145kb 127.0.0.1 AjnlVN6
filebeat-2017.11.07 2 p STARTED 452 171.6kb 127.0.0.1 AjnlVN6
filebeat-2017.11.07 0 p STARTED 440 147.6kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 3 p STARTED 26 20.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 1 p STARTED 21 47.2kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 4 p STARTED 15 28.4kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 2 p STARTED 23 19.4kb 127.0.0.1 AjnlVN6
filebeat-2017.11.09 0 p STARTED 26 20.5kb 127.0.0.1 AjnlVN6
.kibana 0 p STARTED 3 60.3kb 127.0.0.1 AjnlVN6
winlogbeat-2017.45 3 p STARTED 102 206.2kb 127.0.0.1 AjnlVN6
winlogbeat-2017.45 2 p STARTED 80 166.3kb 127.0.0.1 AjnlVN6
winlogbeat-2017.45 4 p STARTED 110 284.8kb 127.0.0.1 AjnlVN6
winlogbeat-2017.45 1 p STARTED 90 289.3kb 127.0.0.1 AjnlVN6
winlogbeat-2017.45 0 p STARTED 121 289.6kb 127.0.0.1 AjnlVN6
winlogbeat-2017.11.07 3 p STARTED 62 197.7kb 127.0.0.1 AjnlVN6
winlogbeat-2017.11.07 2 p STARTED 50 257.8kb 127.0.0.1 AjnlVN6
winlogbeat-2017.11.07 4 p STARTED 52 292kb 127.0.0.1 AjnlVN6
winlogbeat-2017.11.07 1 p STARTED 56 147.6kb 127.0.0.1 AjnlVN6
winlogbeat-2017.11.07 0 p STARTED 56 158.6kb 127.0.0.1 AjnlVN6
filebeat-2017.11.10 3 p STARTED 25 32.2kb 127.0.0.1 AjnlVN6
filebeat-2017.11.10 1 p STARTED 22 40.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.10 4 p STARTED 26 25.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.10 2 p STARTED 25 59.7kb 127.0.0.1 AjnlVN6
filebeat-2017.11.10 0 p STARTED 17 29.5kb 127.0.0.1 AjnlVN6
filebeat-2017.11.11 3 p STARTED 24 58.6kb 127.0.0.1 AjnlVN6
filebeat-2017.11.11 1 p STARTED 24 25.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.11 4 p STARTED 29 32.7kb 127.0.0.1 AjnlVN6
filebeat-2017.11.11 2 p STARTED 21 57.7kb 127.0.0.1 AjnlVN6
filebeat-2017.11.11 0 p STARTED 21 13kb 127.0.0.1 AjnlVN6
winlogbeat-2017.46 3 p STARTED 0 130b 127.0.0.1 AjnlVN6
winlogbeat-2017.46 1 p STARTED 1 15.5kb 127.0.0.1 AjnlVN6
winlogbeat-2017.46 4 p STARTED 2 32kb 127.0.0.1 AjnlVN6
winlogbeat-2017.46 2 p STARTED 1 15.5kb 127.0.0.1 AjnlVN6
winlogbeat-2017.46 0 p STARTED 0 130b 127.0.0.1 AjnlVN6
filebeat-2017.11.06 3 p STARTED 376 155.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.06 2 p STARTED 386 123.1kb 127.0.0.1 AjnlVN6
filebeat-2017.11.06 4 p STARTED 385 149.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.06 1 p STARTED 394 114.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.06 0 p STARTED 424 158kb 127.0.0.1 AjnlVN6
filebeat-2017.11.12 3 p STARTED 23 19.7kb 127.0.0.1 AjnlVN6
filebeat-2017.11.12 1 p STARTED 25 37kb 127.0.0.1 AjnlVN6
filebeat-2017.11.12 4 p STARTED 34 63.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.12 2 p STARTED 26 32.1kb 127.0.0.1 AjnlVN6
filebeat-2017.11.12 0 p STARTED 29 21kb 127.0.0.1 AjnlVN6
filebeat-2017.11.08 3 p STARTED 35 18.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.08 1 p STARTED 34 39.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.08 4 p STARTED 25 42.9kb 127.0.0.1 AjnlVN6
filebeat-2017.11.08 2 p STARTED 35 28.3kb 127.0.0.1 AjnlVN6
filebeat-2017.11.08 0 p STARTED 42 59kb 127.0.0.1 AjnlVN6
rob@LinELK01:~$
Awesome, everything appears to work as expected and we just freed up all of that disk space!
Note: You may have noticed there are some gaps in the indices referenced above, as well as different timestamp patterns used on some of them. That is just because this is a demo box that has been on and off over the last few months and has had a few config changes.
8.) Now to automate this task just set up a simple cron job:
$ crontab -e
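Then add the same Curator command used above, prefixed with the schedule. Keep in mind that cron runs non-interactively, so sudo cannot prompt for a password here; this only works if passwordless sudo is configured for your user, otherwise add the line without sudo to root’s crontab (sudo crontab -e). The following line will run Curator every day at midnight: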
0 0 * * * sudo curator --config /home/rob/Curator/Curator-Config.yml /home/rob/Curator/Actions-File.yml
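If the five schedule fields at the start of that line look cryptic, here is a tiny sketch that just labels them. It is only a reading aid and does not validate or run anything; the function name describe_schedule is made up for illustration:

```python
# The five time fields of a crontab entry, in the order they are written.
CRON_FIELDS = ("minute", "hour", "day of month", "month", "day of week")


def describe_schedule(expression):
    """Map each of the five cron time fields to its name."""
    values = expression.split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError("expected 5 fields, got %d" % len(values))
    return dict(zip(CRON_FIELDS, values))


# Every day at midnight (the line used above)
print(describe_schedule("0 0 * * *"))

# Once a week, Sunday at midnight (day of week 0 is Sunday)
print(describe_schedule("0 0 * * 0"))
```

The second expression is what you would use for the once-a-week schedule mentioned at the start of this post.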
And that is it, the ELK stack will now take care of itself!