| Nick | Message | Time |
|---|---|---|
| systemdlete | So another VM locked up earlier today. Found this in the kern.log: https://dpaste.com/FBJGMHFUA | 04:34 |
| systemdlete | anyone take a gander at what causes this? The VM becomes unresponsive to any input, although I was able to get a clean shutdown by sending an ACPI signal to the VM via vbox's tools. | 04:34 |
| gnarface | systemdlete: /msg it to me or paste.debian.net? | 04:35 |
| systemdlete | Is there a problem using dpaste.com? | 04:35 |
| systemdlete | (just asking) | 04:35 |
| gnarface | trusting new domains is risky, nothing specific about that one | 04:36 |
| systemdlete | I didn't think dpaste.com was all that new, but OK. I'll post it on paste.debian.net | 04:36 |
| gnarface | s/new/additional/ | 04:37 |
| systemdlete | https://paste.debian.net/hidden/8962c79a/ | 04:37 |
| systemdlete | (pastebinit works again for me--YAY!) | 04:37 |
| | * systemdlete had problems with pastebinit for a very long time. Just tried it now, after several years of avoiding it.) | 04:38 |
| gnarface | hmm, well this is definitely above my head, all i've got is the vague impression that it's gotta be related to a flaw in vbox's internal graphics driver and/or KMS | 04:39 |
| systemdlete | what is KMS? | 04:39 |
| systemdlete | (I'll ask the vbox folk if they can make anything of this.) | 04:40 |
| gnarface | Kernel Mode Setting (the thing that hands your system console over to framebuffer handling, which is on by default now) | 04:40 |
| systemdlete | btw, I should have mentioned: This log is from the VM, not the host | 04:40 |
| systemdlete | Let me see if the host noticed it also | 04:40 |
| gnarface | so i see drm_kms_helper, (drm = direct rendering manager) and vmwgfx going back and forth in this stack trace, so whatever went off the rails here seems to be to do with the graphical stack | 04:41 |
| gnarface | and uh, recalling from your past vbox tribulations, that seems to be a repeat offender | 04:42 |
| systemdlete | Host noticed nothing. | 04:42 |
| gnarface | maybe a good sign | 04:42 |
| systemdlete | in fact, there are no logs on host in kern.log from about 3am to past the time of the incident in the VM | 04:43 |
| gnarface | maybe whatever triggered the glitch out is some identifiable process inside that particular VM, and you can just avoid it somehow | 04:43 |
| systemdlete | OK, I will take this up with the vbox folk. Just thought it MIGHT be something to do with the env inside the VM | 04:43 |
| systemdlete | right | 04:43 |
| systemdlete | thanks | 04:43 |
| gnarface | no problem, good luck | 04:43 |
| systemdlete | If I truncate a file from cmd line thus: >somefile (not with echo, just the '>') while another process has the same file open for writing, I'd expect the running process to continue writing to the END of the file, not location 0. At least, that was how things used to work back in the day, and it caused lots of drama at times. Has this behavior changed? Or was I sleeping during kernel internals classes? | 06:11 |
| systemdlete | I have an instance of rsyslogd running that writes to a file, and I can see that it remains open persistently, even after I truncate that file from the command line, as above. | 06:12 |
| rwp | systemdlete, You are correct. Are you seeing something different? Is the writing process lseek()'ing when writing? | 06:12 |
| | * systemdlete is elated that, at very least, his notions are right... | 06:12 |
| systemdlete | thanks rwp | 06:12 |
| rwp | There is a subtle difference between programs that write log files opened for writing and those that open for append. | 06:13 |
| systemdlete | I'm not really sure what rsyslog is doing. I think it just opens the file for writing, probably for appending, since the output log might exist from a previous boot | 06:13 |
| systemdlete | ah! | 06:13 |
| rwp | Files opened for append always append and that is enforced by the kernel. Even if multiple programs are writing to the file. | 06:13 |
| systemdlete | ok... | 06:13 |
| rwp | Files opened for writing /may/ have a race condition problem if TWO or more processes are also writing the file at the same time. | 06:13 |
| systemdlete | Without examining rsyslogd's guts, I'm at a loss for what it is doing in every last case where it opens a file or writes to one | 06:13 |
| rwp | But if you are talking syslog writing to the file then that is only ever one process. If you ">syslog" the file to truncate it then future writes should just continue to show up in the file after that truncation. | 06:14 |
| systemdlete | In my case, only rsyslogd will be writing to it. My other task will be merely reading it and then wanting to truncate it, to spare the space | 06:14 |
| rwp | Just by way of practice if there is some log file that has filled up the disk, say it is TB in size and consumed all space, and I /think/ I have killed the daemon but out of paranoia might believe I have not killed the daemon then I always truncate the file first ">logfile" as you have done to truncate the file and to free the space, before removing the file. | 06:15 |
| systemdlete | I think of open file handles as their inodes. An inode has a "pointer" to where it will write next (or read next) | 06:16 |
| rwp | The file system is reference counted. The file is only ever actually removed when the last open has closed. If any program has an open file descriptor to it then the file is not freed. Which means I have repeatedly seen people remove very large logfiles and then wonder why they did not free up any space. Space was not freed up because some program was "tail -F logfile" or otherwise still having an open file handle to it. So I | 06:17 |
| rwp | always truncate before removing a large file. | 06:17 |
| systemdlete | I am going to guess, strongly, that rsyslogd opens log (output) files for append. That only makes sense for its function. | 06:17 |
| rwp | In addition to the inodes a program has file descriptors. A structure held in an array. Which is why they are numbered 0, 1, 2, ... N and so on. In the file descriptor structure is more information pointing to the inode. | 06:17 |
| systemdlete | right, true that | 06:18 |
| systemdlete | but I doubt much that there are multiple handles on any single logfile (output) | 06:18 |
| rwp | The difference between O_APPEND yes or no is only a difference if there are multiple programs writing. But you were asking about writing to the END of the file above so that had me talking about O_APPEND. | 06:18 |
| systemdlete | let's assume O_APPEND yes | 06:19 |
| rwp | So in the /var/log/syslog you truncate the file (why? was it too large consuming all space) and then the log file continued to be written to? Yes or no? Bring me into sync with you. | 06:19 |
| systemdlete | ok | 06:19 |
| onefang | Has someone mentioned logrotate yet? | 06:19 |
| systemdlete | sorry if I was not clear | 06:19 |
| rwp | Not yet (logrotate) but it is coming soon. | 06:20 |
| systemdlete | No, this is just me and my own concoction | 06:20 |
| systemdlete | this has NOTHING to do with logrotate. That works perfectly, so let's leave that topic out of this. | 06:20 |
| systemdlete | What I am doing: | 06:20 |
| systemdlete | I have rsyslogd listening for some messages from a remote sender. If the message matches a certain string (by tag) then it writes a message to a log file. | 06:21 |
| systemdlete | Now, what I WANT to do (unless maybe a better way exists), is for a DIFFERENT process to read that file, at intervals, and then truncate it, having completed its task. | 06:22 |
| systemdlete | The problem here is that I am trying to make rsyslogd do something it really wasn't ever meant to do, not in about a million years (or until the 64 bits run out) | 06:22 |
| systemdlete | I admit to being the sole culprit, if so | 06:22 |
| systemdlete | BUT | 06:23 |
| rwp | There is an intrinsic race condition written into your statement of a gap in time between reading and finished processing and then truncating when syslog may write a new line there after the reading and before the truncating and that line would be lost in that case. | 06:23 |
| systemdlete | What I do observe is that by truncating the log file, the log file remains open in rsyslogd and it continues to write messages to it, but from the truncation point (pos zero), not wherever its own internal control structures think it should | 06:24 |
| | * rwp has never missed rich-communication-services typing notification as much as now... | 06:24 |
| rwp | If you have truncated the file then that part of the file is now truncated and writing will continue at byte 0. What else would it do? Create a sparse file with a hole in the middle? Would that be useful? | 06:25 |
| systemdlete | yes, there is a chance of a race condition. But that still doesn't make it clear to me why I am observing this behavior. It will work just fine, but I think it is wrong. | 06:25 |
| systemdlete | Yeah, I'd expect something like a sparse file, but in the past, as I said, that created quite a bit of drama. | 06:26 |
| systemdlete | (I mean, at work, not here at home, and those were actual Unix systems, not Linux for the most part) | 06:26 |
| rwp | A sparse file in the syslog would be a problem. Because anything reading it would start at offset 0 and then start reading the hole data which is bit-zero bits until they get read up to the end of the hole. Meaning that grep and such would need to handle binary data. Not good. | 06:27 |
| systemdlete | oh yes, that would be! | 06:27 |
| systemdlete | I am not disagreeing with you in the slightest | 06:27 |
| systemdlete | So it sounds like you are saying that the kernel will "move" rsyslogd's internal write pointer to the NEW output location, namely location zero. Is that it? | 06:28 |
| rwp | Yes. That's correct. | 06:29 |
| systemdlete | TBH, I am inquiring about a behavior that will actually work perfectly for my use case. But it still doesn't match my years of experience with open file pointers. | 06:30 |
| rwp | Now I am glad I mentioned O_APPEND before because now it is important. To create the sparse file one would need to truncate, then lseek() to the size wanted, then write something to create a hole between offset 0 and the new size of the file. And if opened without O_APPEND the original writer would probably write back at offset 0 and not at the end of the file. With O_APPEND it would track to the end of the file. | 06:30 |
| rwp | I am actually not 100% sure without trying it what would be the actual behavior. But it seems undesirable in either case. So I wouldn't do it regardless of which way it works. | 06:31 |
| systemdlete | rwp: Not to addle you in any way, but is it just possible that Unices and Linux handle this somewhat differently. | 06:31 |
| systemdlete | Like I said, I am surprised. | 06:31 |
| systemdlete | (but maybe not disfavorably, since this would actually work for me) | 06:32 |
| rwp | If a lawyer ever starts a question with "is it possible..." jump in right there immediately and say YES, Yes it is possible, then ask them what the rest of the question is that they were going to ask. | 06:32 |
| systemdlete | I am not a lawyer. | 06:32 |
| systemdlete | I am not even sure if I qualify as a programmer these days. | 06:33 |
| rwp | But in this case I think it extremely unlikely. The semantics of this is covered pretty thoroughly in the POSIX docs, I am sure it is, and Linux was developed to be Unix compatible on these very fundamental features. | 06:33 |
| rwp | I mean I can recall some crazy bugs over the years. But it's been years and these things are pretty well worked out now. | 06:34 |
| systemdlete | and maybe that's where I am getting this vaguely familiar vibe from... | 06:34 |
| rwp | For example I remember a bug where if the file was truncated then some buggy code decided to optimize dropping writing the file and just caused it to be zero sized. But forgot to update the timestamp! That broke "make" which needs that timestamp in order for things to work. That was a bug which hit us. | 06:34 |
| rwp | But again for the most part now there is much testing and these features we are talking about are pretty basic features. Someone would hit it and it would be a big deal and it would get fixed. | 06:36 |
| | * systemdlete slowly nods in defeat | 06:36 |
| systemdlete | like the bugs in thunderbird? lol | 06:36 |
| rwp | Defeat implies that we were in competition. Were we? I feel defeated because I still don't understand your task here. | 06:36 |
| systemdlete | instead, they "fix" the UI repeatedly. | 06:37 |
| systemdlete | No, I meant defeat w/r/t my own perceptions and memory. | 06:37 |
| rwp | The new Thunderbird version is an entirely different fiasco. I don't even use it and it's almost all I hear about from people only coming in 2nd behind the xz attack this week. | 06:37 |
| rwp | Memory is a crazy thing. But that doesn't mean it didn't happen. It just means we must know that we don't know. That's important. | 06:38 |
| systemdlete | MY memory is beyond crazy, I assure you... | 06:38 |
| systemdlete | Maybe I am thinking back to instances where seek was involved, that sort of thing. | 06:39 |
| systemdlete | It has been close to 40 years since I was mucking much with basic system calls. | 06:39 |
| rwp | I think instead of truncating the file that instead you should rotate the file. Move the file then signal rsyslog to re-open the file. That's the standard technique. Since everything is already doing it that way we know it will work that way. | 06:40 |
| systemdlete | Yes. It's just that seems so... inelegant? | 06:41 |
| systemdlete | But I was going to do it that way ultimately | 06:41 |
| systemdlete | I just thought maybe look into some alternative approach that might not involve sliding inodes around | 06:41 |
| systemdlete | I've done it that way in the past for these kinds of files | 06:41 |
| systemdlete | And to address your defeated feelings in this convo, the file that is being generated is actually nothing more than a status file. | 06:42 |
| systemdlete | Which is why I said I am really pushing rsyslogd to do something it was never intended to do (though it would be NICE!) | 06:43 |
| rwp | You don't need to tell me but... I am not sure why you need to trigger truncating the file based upon having read it to some point. That's an odd thing. | 06:43 |
| systemdlete | because the contents (i.e., status) will change at some point | 06:43 |
| rwp | And I am not sure why after truncating the file there would be a use to leave a hole in the file of the truncated part and leave the file seek'd to a size prior to truncation. That's also not understood here at all. I would definitely NOT want that to happen. | 06:44 |
| systemdlete | the status-checking script or program will read the contents--all of it--to determine the state of something | 06:44 |
| systemdlete | No, definitely never wanted a hole in the file | 06:44 |
| rwp | okay | 06:44 |
| systemdlete | The problem here is strictly my misunderstanding of what happens in the scenario where a file is open by one program and truncated by another. | 06:45 |
| systemdlete | that's all. | 06:45 |
| systemdlete | sorry if I made this confusing | 06:45 |
| rwp | It's okay. It's discussion for learning. | 06:45 |
| rwp | If the file is opened for writing by one program and then truncated by another the original program will still be able to continue writing but after truncation the new location will be offset 0. | 06:45 |
| systemdlete | right. | 06:46 |
| systemdlete | And that will work perfectly for me, if it is dependable and well defined, which it sounds like it is. | 06:46 |
| systemdlete | (per POSIX, etc) | 06:46 |
| rwp | And that is the way legacy Unix systems worked too. I spent probably 20 years working on HP-UX systems which is a legacy Unix and that's the way it worked there. | 06:46 |
| systemdlete | Like I said, maybe the scenarios I (vaguely) recall involved seek() calls. Idk | 06:47 |
| rwp | HP-UX does have some odd quirks. | 06:47 |
| rwp | If one seeks forward in a file past the current end of file and writes then the holes left behind are how files with holes in them are created. | 06:48 |
| systemdlete | I worked with AT&T SysV on 3B2's 3B20's, Amdahl/UTS, HP/UX, AIX, Solaris (and SunOS), as well as a little bit on SGI and a raft of "supermicros" (c. 1985) | 06:48 |
| rwp | If one seeks backward in a file then they overwrite the file from that point. | 06:48 |
| systemdlete | And there were definitely some differences | 06:49 |
| rwp | I worked for HP so HP really only ever wanted to see HP-UX systems in house. But for various reasons we did have SunOS/Solaris and IBM AIX systems at times and then Apollo too after the acquisition. | 06:50 |
| systemdlete | thank you for wading through this with me. I think I can proceed | 06:51 |
| rwp | It was a good discussion! Good luck. Happy hacking! I am going to focus on my own task for a few minutes. | 06:51 |
| friedhelm | After the latest security update, RSYNC refused to work! All kinds of strange errors. A reboot fixed it. So it seems a reboot is required! | 10:36 |
| ted-ious | friedhelm: That shouldn't happen. | 11:39 |
| friedhelm | Yes, it shouldn't. The updates all deal with access to the file system. So it makes kind of sense... | 11:53 |
| joerg | systemdlete: man 3p open #>>O_APPEND If set, the file offset shall be set to the end of the file ***prior to each write***.<< | 11:56 |
| joerg | friedhelm: rsyncd daemon kept running while it actually needed a restart? | 12:35 |
| joerg | tried a `needrestart` ? | 12:37 |
| friedhelm | Too late to investigate. After my backup script refused to work, I rebooted and it is working OK now. | 12:39 |
| friedhelm | I just tried to alert everybody, in case you're running into the same problem. | 12:41 |
| joerg | `needrestart -m a -b` | 12:43 |
| joerg | `needrestart -m a -b -r l` even | 12:44 |
| joerg | when the daemon running uses an outdated library... | 12:45 |
| friedhelm | Hmm.. I don't even have it installed... | 12:46 |
| friedhelm | Maybe I should. | 12:47 |
| onefang | A useful tool, it'll tell you what needs restarting when you update things. | 12:47 |
| joerg | without the -b it takes me into an ncurses interactive "GUI" | 12:47 |
| friedhelm | Yes, makes sense. Will install it. | 12:48 |
| joerg | while we're talking about it :-) | 12:50 |
| friedhelm | Hmm... "Use of uninitialized value $ucode_vars{"CURRENT"} in concatenation (.) or string at /usr/sbin/needrestart line 940." | 12:53 |
| friedhelm | Hmm... Works without the -b ... guess I need to read the man page. | 12:58 |
| joerg | `` tail -n +935 `which needrestart` \| head -n 10 `` ? | 13:10 |
| joerg | or: `` cat -n `which needrestart` \| grep -C5 '^ *940' `` | 13:16 |
| joerg | actually, looks like needrestart depends on systemctl for restarting services :-/ | 13:30 |
| joerg | https://github.com/liske/needrestart/blob/master/NEWS#L64 no idea | 13:36 |
| CueXXIII | needrestart 3.6 in devuan ceres uses invoke-rc.d here to restart services, and i have no systemctl installed | 13:39 |
| joerg | https://github.com/liske/needrestart/blob/master/README.md >>needrestart supports but does not require systemd (available since v0.6)<< | 13:39 |
| joerg | >> dep: xz-utils << ohmy | 13:53 |
| CueXXIII | xz-utils 5.4.5 should be fine | 14:23 |
| joerg | yup | 14:24 |
| joerg | can you reproduce or guess the error friedhelm quoted above? | 14:26 |
| amarsh04 | checkrestart is in debian-goodies package | 14:26 |
| friedhelm | Daemon was not running! | 14:27 |
| friedhelm | Works now. | 14:28 |
| friedhelm | A reboot would have fixed it. ;-) | 14:30 |
| joerg | :-) | 14:32 |
| joerg | which daemon though? | 14:32 |
| friedhelm | needrestart | 14:33 |
| joerg | it has a daemon? | 14:33 |
| friedhelm | Hmm.. no. I mixed it up. | 14:36 |
| friedhelm | What's the -b supposed to do? It runs fine without, but gives the above error with -b . | 14:37 |
| joerg | batch mode, not open the "GUI" | 14:37 |
| joerg | which doesn't really make sense regarding the error | 14:37 |
| friedhelm | Ah.., I don't have a GUI. Run fvwm2. | 14:38 |
| joerg | it's ncurses GUI | 14:38 |
| joerg | like mc | 14:38 |
| friedhelm | Not here. | 14:38 |
| friedhelm | Running without -b outputs into the xterm. | 14:39 |
| joerg | so what do you get without -b ? | 14:39 |
| joerg | interesting | 14:39 |
| friedhelm | Normal commandline interface. | 14:39 |
| joerg | what I need the -b for | 14:39 |
| friedhelm | It's also possible I'm missing a package. I always install without recommends. | 14:40 |
| joerg | https://screenshots.debian.net/package/needrestart the blue ones I get without -b | 14:40 |
| friedhelm | I'm on current daedalus amd64 xorg and fvwm2. I do have ncurses installed. | 14:41 |
| joerg | aaaah, probably -b only needed when there's actually some restart pending | 14:42 |
| joerg | without anything to show in ncurses, it simply falls through to cmdline | 14:43 |
| joerg | otherwise https://termbin.com/msmf | 14:45 |
| friedhelm | Ahh.. Using my favorite search engine with the error message leads to Debian bug #891923. Did not read it yet, but the problem seems to be known. | 14:49 |
| friedhelm | Here is another one: Debian bug #1063719. | 14:54 |
| friedhelm | Except my processor is an Intel Celeron. | 14:55 |
| joerg | >>Lack of iucode-tool triggers the $ucode_vars{"CURRENT"} error<< | 15:05 |
| joerg | https://groups.google.com/g/linux.debian.bugs.dist/c/6MivFljZmQo | 15:06 |
| joerg | https://packages.debian.org/bookworm/needrestart >>sug: iucode-tool << [https://packages.debian.org/bookworm/iucode-tool] | 15:10 |
| joerg | so that's probably a "bogus error message" error that got fixed some time after 2018? | 15:12 |
| AlexLikeRock | hi | 15:41 |
| AlexLikeRock | i got brand new devuan 4 | 15:41 |
| AlexLikeRock | when i run any program , they say : "order not found " | 15:42 |
| AlexLikeRock | i install from refracta live version | 15:42 |
| AlexLikeRock | exam: # grub-install | 15:43 |
| AlexLikeRock | bash: grub-install: order not found | 15:43 |
| AlexLikeRock | then fail | 15:43 |
| AlexLikeRock | i need always run full path , like this : | 15:44 |
| AlexLikeRock | "/usr/sbin/grub-install" | 15:44 |
| AlexLikeRock | and works | 15:44 |
| AlexLikeRock | how to fix it ? | 15:44 |
| joerg | hah! ~6h old? https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1063719#21 | 15:46 |
| joerg | rwp: ^^^ :-) | 15:47 |
| debdog | AlexLikeRock: if that only happens as root (using su), try "su -" (su SPACE -) | 15:51 |
| AlexLikeRock | yes only happens at root user | 15:52 |
| AlexLikeRock | yes , work with " su - " | 15:54 |
| debdog | O> | 15:54 |
| AlexLikeRock | its an easy & fast : fix it | 15:54 |
| debdog | hmm? | 15:54 |
| debdog | fix what? | 15:55 |
| AlexLikeRock | thanks so much , you guys joerg and debdog | 15:55 |
| AlexLikeRock | my little problem | 15:55 |
| debdog | it's a feature, not a bug, hehe (seriously) | 15:55 |
| debdog | *security feature, they say | 15:56 |
| AlexLikeRock | well, never happen before at me , from any install | 15:56 |
| AlexLikeRock | hheheheh | 15:56 |
| AlexLikeRock | how to stop this security ? | 15:56 |
| debdog | been like this since at least chimaera | 15:57 |
| AlexLikeRock | to make possible , root , run programs , without "su -" | 15:58 |
| debdog | AlexLikeRock: search for "su -" in https://files.devuan.org/devuan_chimaera/Release_notes.txt there are some links mayhap they contain information on how to turn this off | 15:58 |
| AlexLikeRock | thanks so much | 15:59 |
| debdog | I usually log in as root via ssh and hence do not have that issue | 15:59 |
| joerg | you're better off getting used to using "su -" instead of plain "su" | 16:00 |
| joerg | alternative(?): alias su='su -' ;# UNTESTED! | 16:02 |
| debdog | that's prolly a good idea, yes | 16:04 |
| debdog | AlexLikeRock: ° | 16:04 |
| debdog | erm | 16:04 |
| debdog | AlexLikeRock: ^ | 16:04 |
| AlexLikeRock | cool alias , joerg , thanks | 16:05 |
| joerg | AlexLikeRock: really, better get used to "su -", see man su\|less '+/-, -l, --login' | 16:17 |
| AlexLikeRock | oki doki joerg | 16:19 |
| rwp | joerg, Cool! (regarding new needrestart version) Time to test it out to verify the fix! | 16:48 |
| joerg | :-) | 16:49 |
| fsmithred | Excalibur 'apt policy' tells me the following. Does this mean the old version was forward-ported? | 16:51 |
| fsmithred | liblzma5: | 16:51 |
| fsmithred | Installed: 5.6.1+really5.4.5-1 | 16:51 |
| joerg | aiui yes | 16:54 |
| joerg | as a rapid makeshift | 16:55 |
| joerg | fsmithred: https://lists.debian.org/debian-security-announce/2024/msg00057.html | 17:29 |
| fsmithred | joerg, thanks for the bug link. I saw it between my disconnects. Connection is expected to get worse in a couple hours when it starts raining heavy again. | 19:33 |
| rwp | At least you are getting rain! :-) | 20:08 |
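
The truncate-while-open question from 06:11 onward, together with joerg's O_APPEND quote at 11:56, can be tried out with a small Python sketch. It assumes a POSIX system such as Linux; the helper name `write_truncate_write` and the temporary file are made up for illustration and say nothing about how rsyslogd itself opens its files. As far as POSIX specifies it, truncating a file does not move another descriptor's offset, so only the O_APPEND writer continues at byte 0. A writer opened without O_APPEND keeps its old offset and its next write lands after a run of zero bytes.

```python
import os
import tempfile


def write_truncate_write(extra_flags):
    """Write, truncate through a second open (like `>file`), write again.

    Returns (size, contents) of the file after the second write.
    """
    fd, path = tempfile.mkstemp()   # hypothetical stand-in for the log file
    os.close(fd)
    writer = os.open(path, os.O_WRONLY | extra_flags)
    try:
        os.write(writer, b"first line\n")   # 11 bytes; writer offset is now 11
        # Same effect as the shell's `>file`: open with O_TRUNC, then close.
        os.close(os.open(path, os.O_WRONLY | os.O_TRUNC))
        os.write(writer, b"second\n")       # 7 bytes
        with open(path, "rb") as f:
            data = f.read()
        return len(data), data
    finally:
        os.close(writer)
        os.unlink(path)


append_size, append_data = write_truncate_write(os.O_APPEND)
plain_size, plain_data = write_truncate_write(0)

print("O_APPEND:", append_size, append_data)   # 7 bytes, just the second write
print("plain   :", plain_size, plain_data)     # 18 bytes, 11 NULs then the second write
```

If the writer turns out to use O_APPEND, truncating from another process behaves the way systemdlete observed. Without it, the zero-filled gap that rwp warned about at 06:27 would appear instead.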
Generated by irclog2html.py 2.17.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!
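
rwp's habit of truncating a large log before removing it (06:15 to 06:17) can be illustrated the same way. This is a rough sketch assuming a local POSIX filesystem; `size_held_after_removal` is a hypothetical helper, and the descriptor returned by `mkstemp` plays the role of a daemon that still has the log open.

```python
import os
import tempfile


def size_held_after_removal(truncate_first):
    """Remove a file while a descriptor is still open on it.

    Returns (st_nlink, st_size) as seen through the still-open descriptor.
    """
    fd, path = tempfile.mkstemp()   # fd stands in for the daemon's open log
    try:
        os.write(fd, b"x" * 4096)
        if truncate_first:
            os.truncate(path, 0)    # same effect as `>logfile`
        os.unlink(path)             # same effect as `rm logfile`
        st = os.fstat(fd)
        return st.st_nlink, st.st_size
    finally:
        os.close(fd)                # only now can the kernel release the inode


print("rm only        :", size_held_after_removal(False))
print("truncate, do rm:", size_held_after_removal(True))
```

With a plain removal the open descriptor still reports the full 4096 bytes, so the space stays in use until the holder closes the file. Truncating first drops the size to 0 straight away.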