/var/log file fills up to 100% using pihole


ryadre


Orange Pi Zero running Pi-hole. The web interface stopped updating and I found the /var/log filesystem was 100% full. The only way I could get it working again was to delete some of the files. How do I stop this from filling up again please? From what I read it may be an Armbian issue more than a Pi-hole one?


22 hours ago, ryadre said:

How do I stop this from filling up again please?

 

Open a GitHub issue and request sane (not verbose as hell) logging by default. But be prepared that this might take ages even for trivial changes (example).

 

Moving the log file around is not an option if you run off an SD card since the card will die a hundred times faster. So ask them to decrease their log verbosity or at least to implement daily log rotation with compression applied.
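For reference, daily rotation with compression needs only a tiny logrotate policy. A sketch (hypothetical /etc/logrotate.d/pihole; /var/log/pihole.log is the file pihole writes to by default, and pihole's own tooling may still want the live file, so treat this as illustration only):

/var/log/pihole.log {
    daily
    rotate 3
    compress
    delaycompress
    missingok
    notifempty
}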


A default pihole installation doesn't present this issue.

 

This is the situation after 24h of use (standard log verbosity):


francesco@orangepizero:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            212M     0  212M   0% /dev
tmpfs            50M  4,6M   45M  10% /run
/dev/mmcblk0p1   15G  1,6G   13G  11% /
tmpfs           247M   16K  247M   1% /dev/shm
tmpfs           5,0M     0  5,0M   0% /run/lock
tmpfs           247M     0  247M   0% /sys/fs/cgroup
tmpfs           247M     0  247M   0% /tmp
log2ram          50M  3,7M   47M   8% /var/log
/dev/sda1        29G  1,1G   28G   4% /mnt/USB32
tmpfs            50M     0   50M   0% /run/user/999
tmpfs            50M     0   50M   0% /run/user/1000

 

On OPIZ v1.4, Armbian Stretch 5.41, pihole 3.21


Today my /var/log was also full.

pihole.log was about 31 MB and so couldn't be logrotated.

There is a "problem" when Pi-hole gets too many requests: then the 50 MB log2ram isn't enough.
Mostly there is then a single client which sends far too many requests:
https://discourse.pi-hole.net/t/problem-with-armbian-and-log2ram-var-log-fills-up/6197/11

I have identified my "wild client" as my Google Home.

Here in Turkey tunein.com and cast.tunein.com are blocked (I don't know why), but once the Google Home has been instructed to play a TuneIn channel and can't reach it, it sends DNS requests to these domains the whole day :(

 

So in the last day I found 72,000 requests - that's why my /var/log is full and can't be logrotated.
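As an immediate stopgap the log can also be flushed by hand; if I remember correctly the standard pihole CLI offers this (treat as a hint, not gospel):

pihole -f    # "flush" - empties /var/log/pihole.log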

 

Additionally, you could check the size of the long-term DB:

DBFILE=/etc/pihole/pihole-FTL.db
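For example (assuming the default location above):

ls -lh /etc/pihole/pihole-FTL.db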

 

Information about setting some DB configuration options is at:

https://github.com/pi-hole/FTL#ftls-config-file

 

The file /etc/pihole/pihole-FTL.conf has to be created - I will try my luck with the following entries:

MAXDBDAYS=7
DBINTERVAL=5.0
DBFILE=/etc/pihole/pihole-FTL.db
MAXLOGAGE=24.0

 

Only 7 days kept in the DB, writing to the DB every 5 minutes (instead of every 1.0 minute), the DB in its normal location, and MAXLOGAGE in RAM at the normal 24 hours.

 

I have now blacklisted tunein.com and cast.tunein.com - I hope this will shrink the log.
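(If I remember correctly this can also be done from the CLI, e.g.:)

pihole -b tunein.com cast.tunein.com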

 

BTW: because Pi-hole is the only app running on my OPi One I will increase the size of log2ram, which is configured in the file /etc/default/log2ram:

# size of the tmpfs mount
SIZE=100M

 

but Armbian doesn't seem to apply this log2ram size at bootup...
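In case it helps others: on my install log2ram runs as a systemd service, so after editing /etc/default/log2ram something like this should apply the new size without a reboot (assuming the service really is called log2ram):

sudo systemctl restart log2ram
df -h /var/log    # verify the new tmpfs size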
 

 

 


72,000 requests (in 24h?) for a single domain is an abnormal situation.

My log2ram (/var/log) usage is always between 5 and 10%.

 

This is after 6 days of pihole filtering (standard lists + nocoin, standard logs, no blacklisting, 6 devices):

 

(screenshot attached: Schermata del 2018-02-17 14-22-10.png)


4 hours ago, Moklev said:

72,000 requests (in 24h?) for a single domain is an abnormal situation.

My log2ram (/var/log) usage is always between 5 and 10%.

This is after 6 days of pihole filtering (standard lists + nocoin, standard logs, no blacklisting, 6 devices):

Now the client with 72,000 requests is blocked (tunein.com), but my Raspberry Pi "Mycroft" AI has made 4,000 requests since 13:00:

(screenshot attached: Low_Network_Requests.jpg)


I have a similar issue. I have Pi-hole on a Cubox-i where I also run a Plex server. Yesterday I set up the Plex server for music and I think that created too many requests, so it filled up /var/log really quickly until it was totally full.

So I increased the size of the partition from 50 MB to 150 MB. But I don't know if this is going to fix the issue in the near future.

I'm quite new to Linux and of course Armbian.

You've been talking about logrotating the logs, but I don't know if I have that set up or not. Where can I get more information about that?

Thanks!


My take on this is that pihole today just doesn't play nicely with low-volume mounts like a log2ram tmpfs. It can work, but only at low scale. dnsmasq is an unrestricted firehose of data into the pihole.log file at the moment.

 

@tkaiser is correct that the pihole guys do need to do something about this but my feeling is this is not a priority for them. I happened to pick up on one of the suggestions in their Discourse forum that requested the ability to filter out unwanted log entries, here: https://discourse.pi-hole.net/t/filter-out-sites-from-query-log/3843. If implemented, that would go some way towards helping with this problem.

 

IMO what they need to do is make the logging system more flexible so it can log anywhere. I suppose with a manual tweak to /etc/dnsmasq.d/01-pihole.conf that could be done now, to revert to syslog logging.
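Roughly like this, I think; a sketch only, since the exact file contents differ per pihole version and pihole may rewrite this file on update (DAEMON is one of the syslog facilities dnsmasq accepts):

# /etc/dnsmasq.d/01-pihole.conf (excerpt)
log-queries
# log-facility=/var/log/pihole.log   <- comment out the file target...
log-facility=DAEMON                  # ...so queries go to syslog again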

 

I think the options today are: do the manual dnsmasq reconfiguration, point syslog to some other box with HDD storage, and use some other UI to view what the pihole web admin GUI provides (i.e. an ELK stack); tread carefully with how you use pihole so it won't fill the file system; or just switch off their log completely and live with no UI. I guess another option is to do what @lpuerto did, switch off log2ram and let it rip, but that's not for me.


26 minutes ago, Joe said:

@tkaiser is correct that the pihole guys do need to do something about this but my feeling is this is not a priority for them.

 

Sorry, but why is pihole called pihole? Didn't they start on the Raspberry Pi, where 'normal storage' is defined as an SD card? I'm entirely done with this software after my last attempt to get a minor security fix applied, which took 4 months, so I won't look into their code ever again.

 

But if they do not use syslog (where you could simply define your own filter rules and send all the useless crap to /dev/null) and instead write a logfile directly, you should check whether they constantly issue sync calls or not. If not, their log contents end up buffered in RAM and will only be written to card every ten minutes, which greatly reduces write amplification and wear on the SD card. An explanation, and how to check for this, is here: https://forum.armbian.com/topic/6635-learning-from-dietpi/?do=findComment&comment=50489
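A quick way to check for those sync calls, assuming strace is available and a single dnsmasq process (otherwise pick the PID by hand):

sudo strace -f -e trace=fsync,fdatasync,sync -p $(pidof dnsmasq)
# silence while queries are being logged = no forced flushes, the page cache decides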

 


pihole relies on dnsmasq to log to /var/log/pihole.log. They extend the dnsmasq configuration like this: https://github.com/pi-hole/pi-hole/blob/master/advanced/01-pihole.conf#L43. I completely understand if you don't want to go there and look at that btw, but they basically instruct dnsmasq not to use syslog and to write to the pihole.log file via the `log-facility` option (https://linux.die.net/man/8/dnsmasq).

 

It's really nice that Armbian configures the commit=600 cache flush mechanism for the SD card, but I'm afraid I don't know if dnsmasq will respect this.
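You can at least verify that the commit=600 mount option is active on the root filesystem, e.g.:

grep commit /etc/fstab
mount | grep 'commit='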

 

In terms of dealing with the problem in pihole / their default configuration, to me it seems altogether safer and more manageable to manually reconfigure dnsmasq in their 01-pihole.conf back to using syslog again, regain control of their firehose, and offload all logging onto some other device with HDD storage.
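The forwarding itself is then a standard syslog job. A sketch, assuming rsyslog and a hypothetical log host at 192.168.1.10:

# /etc/rsyslog.d/30-dnsmasq-remote.conf
# send dnsmasq's messages to the remote box via UDP, then stop processing them locally
if $programname == 'dnsmasq' then @192.168.1.10:514
& stop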


Sorry for my ignorance here, I'm not an IT expert, just an aficionado. 

 

I fail to catch all the drama about this and why the logs can't be stored directly on the SD card. I understand that logs can't be allowed to grow so much that they finally clutter the system storage, and as far as I understand that's what logrotate was created for.

 

PS: Disabling log2ram in my case seems to be fine. Right now the total log volume is just 21M. I guess logrotate worked and compressed the logs.


14 minutes ago, Joe said:

It's really nice that Armbian configures the commit=600 cache flush mechanism for the SD card, but I'm afraid I don't know if dnsmasq will respect this.

Check the PID of dnsmasq and let iotop report what's going on? You need to differentiate between 'total' and 'actual' writes as per iotop's manual page (one is filesystem activity and only the other is relevant, since that's about writing to the layer below).
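For example, something like this (iotop's -p limits the view to the given PIDs; assuming a single dnsmasq process):

sudo iotop -o -d5 -k -p $(pidof dnsmasq)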

 

8 minutes ago, lpuerto said:

I fail to catch all the drama about this and why the logs can't be stored directly on the SD card

Again: https://forum.armbian.com/topic/6635-learning-from-dietpi/?do=findComment&comment=50489

 


On 27/2/2018 at 3:20 PM, t.munzer said:

I'm curious, does log2ram work differently with a mainline kernel? My Pi-hole setup on Orange Pi PC and Sunvell R69 hangs every 2/3 hours/days with the legacy kernel, but runs smoothly for weeks alongside Domoticz with a mainline kernel.

 

Simply... an old and patchy kernel (3.x) vs a stable and modern one (4.14.xx).
My OPZ with mainline runs for weeks too, with pihole and motioneye.


10 hours ago, tkaiser said:

Check the PID of dnsmasq and let iotop report what's going on? You need to differentiate between 'total' and 'actual' writes as per iotop's manual page (one is filesystem activity and only the other is relevant, since that's about writing to the layer below).

 

Thanks for the iotop tip. I ran some tests putting pihole / dnsmasq under load and from what I can see, "Actual DISK WRITE" only intermittently shows activity whereas the dnsmasq process "DISK WRITE" shows much more activity. This gives weight to the theory that dnsmasq is not circumventing / bypassing the commit=600 cache flush write mechanism.

 

With this known, it's not as bad as I originally anticipated to switch off log2ram for pihole purposes... but I still won't be doing that here :)

 

There is a glimmer of hope for pihole users wanting to keep log2ram on, although it's still some way off. I read that they've announced something called FTLDNS which is supposed to remove the need for a log file when it becomes active in pihole 4.0.

 

 


12 hours ago, Joe said:

This gives weight to the theory that dnsmasq is not circumventing / bypassing the commit=600 cache flush write mechanism.

 

No need to construct theories or to believe in anything. These boards are neither black boxes nor do they run some magic no one is able to understand. They run an OS with a kernel that exposes every piece of information needed; it's usually just the knowledge of how to access this information that's missing. Quoting myself from the aforementioned thread (I referenced the thread for a reason, since it's time to talk about how processes access the flash layer on SD cards and eMMC, and to fight against some 'common SD card knowledge' that is totally misleading in reality):

Quote

Grab 'iostat 5' and 'iotop -o -b -d5 -q -t -k | grep -v Total' and start to analyse what's happening at the block device and application layer, forgetting about the filesystem layer in between!

Simply open up two terminal windows, run both commands and let pi-hole/dnsmasq do their job. If you see real activity at the block device layer every few seconds you have to worry; if this happens only every few minutes you're fine. The amount of wear on your SD card is magnitudes lower, thanks to drastically lower write amplification, if dnsmasq doesn't try to flush its log to disk/card all the time.

 

If I understood you right, pi-hole needs this log only for parsing information out of it and storing that into a database? If that's really the case you should also focus on how database contents are updated, for the very same reason as above: constant writes to the card every few seconds lead to a write amplification that can be a hundred or even a thousand times higher compared to combined writes only every few minutes...


On 08/03/2018 at 10:13 AM, tkaiser said:

Check the PID of dnsmasq and let iotop report what's going on? You need to differentiate between 'total' and 'actual' writes as per iotop's manual page (one is filesystem activity and only the other is relevant, since that's about writing to the layer below).

 

Again: https://forum.armbian.com/topic/6635-learning-from-dietpi/?do=findComment&comment=50489

 

Thanks for the reply! 


On 3/9/2018 at 8:13 AM, tkaiser said:

Grab 'iostat 5' and 'iotop -o -b -d5 -q -t -k | grep -v Total' and start to analyse what's happening at the block device and application layer, forgetting about the filesystem layer in between!

 

Simply open up two terminal windows, run both commands and let pi-hole/dnsmasq do their job. If you see real activity at the block device layer every few seconds you have to worry; if this happens only every few minutes you're fine.

 

Thanks, I'd already followed these from your previous post and this is what led me to what was probably a poor choice of words when I said "theory". It wasn't completely clear to me what I was looking at from iostat and iotop, because the values are transient and not cumulative, but I'm happy to say that dnsmasq looks fine: it's not circumventing / bypassing the commit=600 cache when writing to /var/log/pihole.log.

 

pihole has two persistence stores today: the /var/log/pihole.log file, which is mandatory if you want any data logged for the web UI at all, and a SQLite DB which is intended for longer-term storage and can be switched off completely.


1 hour ago, Joe said:

It wasn't completely clear to me what I was looking at from iostat and iotop, because the values are transient and not cumulative

 

But that (monitoring in almost real time, not looking at cumulative values) is the only way it works. You're not interested in the amount written at the filesystem or block device layer but at the flash layer below. That's the only layer that matters here, and due to the nature of flash storage a lot is different here compared to spinning rust (HDDs). We're back at 'write amplification' again: https://en.wikipedia.org/wiki/Write_amplification

 

Overwriting one byte every second ends up as 3600 bytes changed after an hour at the filesystem layer.

 

At the block device layer below, this will already be a lot more, since filesystem structures are also affected: each time you flush filesystem contents to disk, metadata like 'last access time' will also be updated (how efficiently this is done depends on the filesystem in question).

 

At the flash layer below, the amount of data written is magnitudes higher, since flash cannot overwrite data in place. If you want to change something, flash storage devices have to reorganize the contents of 'pages'. As long as there are reserve or unused pages, the data from the page to be modified gets read from the original page, that page is then added to the pool of pages that are soon to be erased (organized in much larger erase blocks), and the contents of the original page plus the changes get written to another, new page. Common page sizes are 8K or even 16K these days, so 1 byte changed at the filesystem layer is 8192 or 16384 bytes at the flash layer. The amount of wear is roughly 8,000 or even 16,000 times higher when changes at the filesystem layer are always flushed immediately to the SD card.

 

And that's what you have to look for with the above-mentioned tools: how often and at which intervals data really gets written at the block device layer, since this will end up with the flash translation layer below writing at least one full new page.

 

Using the example above (overwriting exactly one byte every second for a whole hour) and now looking at filesystem commit intervals:

  1. Always directly flushing all contents to 'disk': for each byte written at the filesystem layer at least one whole page at the flash layer will be completely written. Let's use 8K page size: 3600 bytes changed at the filesystem layer end up as 3600 * 8192 bytes --> 28 MByte at the flash layer
  2. Using the filesystem's default commit interval, e.g. ext4 which uses 5 seconds: 3600 bytes at the filesystem layer result in (3600 / 5) * 8192 --> 5.6 MByte at the flash layer
  3. Now let's take Armbian's commit interval of 600 seconds: the same 3600 bytes at the filesystem layer result in (3600 / 600) * 8192 --> 48 KByte at the flash layer

Appending something to a file (e.g. one line in a logfile) is 'changing a block' at the filesystem layer. Filesystems are usually organized in small entities called blocks or chunks, and when you create the filesystem you also define this block or chunk size. If you do not define this on your own, the filesystem utilities choose defaults (usually depending on the size of the filesystem: larger fs --> larger block sizes). With ext4 you would check for this with:

tune2fs -l /dev/mmcblk0p1 | grep '^Block size'

The number returned here (with Armbian and average-size SD cards, usually 1K) is what will later be changed at the block device layer when a sync call is issued. So a 50-byte new log entry, when committed to 'disk', will end up as 1024 bytes at the filesystem layer and, depending on how many of these filesystem blocks have to be flushed to disk at the same time, as at least one page at the flash layer.

 

So with an application that always tries to flush new log contents immediately to disk, you end up with changes happening every second and 28 MB written at the flash layer per hour. When the application doesn't try to flush the contents to disk but lets the filesystem do its job, the commit interval becomes important. With the default 5 seconds we're at 5.6 MB, while with Armbian's 600 seconds it's a hundred times less: just a few KB.

 

That's the most important part when dealing with flash storage: understanding the effects of unnecessarily high write amplification and how to reduce it. Unfortunately this is both boring and complex at the same time, and users usually prefer simple or even oversimplified BS answers to the question 'how can I prevent my SD card from wearing out too fast?'

 

So it's great that your test confirmed that pi-hole/dnsmasq do not try to flush their log contents immediately to disk, and affected users can simply switch log2ram off until we might come up with a better replacement for log2ram. With our 600-second commit interval the amount of flash wear is pretty minimal :)

 

BTW: the above numbers rest on the rather optimistic assumption that only 8K per page have to be written. If the total amount of data written to the SD card at the flash layer has already exceeded the card's native capacity (writing 8.1 GB to an 8 GB SD card, for example) then it might get much, much worse, since we can't make use of TRIM with SD cards: no more free pages are available and the flash translation layer now always has to erase stuff first (affecting much larger erase blocks, whose size is usually several MB compared to the small 8K/16K pages).

 

More recent quality cards do a pretty good job here. They use an overprovisioning area with a sane amount of reserve pages and do data reorganization in the background (wear leveling and garbage collection). The problem is with crappy, old and slow cards with small capacity. Those really do not deal well with this situation, which they also run into much faster due to their small capacity. That's why we constantly try to educate our users to throw those old and crappy SD cards away and better invest a few bucks in a new and great-performing SD card like those mentioned here:

 


I appreciate your thoughtful and detailed reply and agree this is a bit of a boring and complex area. I admit that I'm probably 80/20 here: I lean towards being cautious, do the right thing in terms of buying decent SD cards and testing them for genuineness and performance, and think about application risk areas, e.g. pihole. In a way I'm probably over-cautious: log2ram will remain enabled and I'll continue to look for ways towards the goal of a read-only file system for the most part. At the same time I'd like for things to just reliably work, but this is really an unrealistic dream and SD card applications do need careful consideration.

 

It might be a boring and complex subject but you're doing a good job of keeping it interesting. Please keep it up.
