| Previous revision |
— | fuss:unix [2025/05/29 03:21] (current) – [Find Last Modified Files] office |
---|
| ====== Secure Remove Instead of Remove ====== |
| |
| [[http://srm.sourceforge.net/|srm]] is a secure deletion tool that overwrites files in multiple passes and is designed to be command-line compatible with ''rm''. Older releases of OS X shipped the tool by default. |
| |
| <code bash> |
| mv /bin/rm /usr/bin/rm.insecure |
| ln -sf /usr/bin/srm /bin/rm |
| </code> |
| |
| Note that moving a file is not sufficient to erase it securely from its original location: copy the file and then wipe the old copy instead. Bear in mind that secure deletion takes much longer than a plain ''rm'' and may put stress on other software that calls ''rm''. For a better solution, please see the [[unix:system-wide_secure_remove|system-wide secure remove page]]. |
| |
| ====== Wipe Free Space ====== |
| |
| Virtual images can be shrunk by first zeroing out the free space available: |
| <code bash> |
| bcwipe -mz -F -S -v / |
| </code> |
| and then by converting the image to a format that does not store zeroed blocks, such as [[https://people.gnome.org/~markmc/qcow-image-format.html|qcow2]], supported by ''qemu'': |
| |
| <code bash> |
| qemu-img convert -O qcow2 image.raw image.qcow2 |
| </code> |
| |
| ====== Use S.M.A.R.T. to Run a Hard-Drive Test ====== |
| |
| To schedule a test, issue: |
| <code bash> |
| smartctl -t short /dev/sda |
| </code> |
| |
| where ''/dev/sda'' is the device to run the test on. |
| |
| The process takes a few minutes, after which you can issue: |
| <code bash> |
| smartctl -l selftest /dev/sda |
| </code> |
| |
| to check the results. |
| |
| The output will look something like the following: |
| <code> |
| # 1 Short offline Completed without error 00% 11482 - |
| |
| </code> |
| if the test completed without errors, or: |
| |
| <code> |
| # 1 Short offline Completed: read failure 90% 23678 200910 |
| |
| </code> |
| |
| to indicate failures. |
| |
| ====== Show File Encoding ====== |
| |
| To determine the encoding of the file ''document.txt'', issue: |
| |
| <code bash> |
| file -bi document.txt |
| </code> |
| |
| ====== Change File Encoding ====== |
| |
| To convert a file ''input.txt'' from ASCII to a new file ''output.txt'' with UTF-8 encoding, issue: |
| <code bash> |
| iconv -f ascii -t utf8 input.txt > output.txt |
| </code> |
| |
| Since UTF-8 can encode characters that do not exist in ASCII, the reverse command will generate errors if the file contains any such characters: |
| <code bash> |
| iconv -f utf8 -t ascii output.txt > input.txt |
| </code> |
| unless we add the ''-c'' flag, which drops the characters that cannot be converted to ASCII: |
| <code bash> |
| iconv -c -f utf8 -t ascii output.txt > input.txt |
| </code> |
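Because a failed conversion is reported through the exit status of ''iconv'', a script can check whether a file converts cleanly before deciding to use ''-c''. The following is a minimal sketch; the helper name ''is_ascii_safe'' and the sample file names are made up for illustration:

```shell
# Hypothetical helper: succeeds only if the file converts to ASCII without loss.
is_ascii_safe() {
    iconv -f UTF-8 -t ASCII "$1" > /dev/null 2>&1
}

# Example: one plain file and one containing "é" (bytes 0xC3 0xA9 in UTF-8).
demo_dir=$(mktemp -d)
printf 'cafe\n' > "$demo_dir/plain.txt"
printf 'caf\303\251\n' > "$demo_dir/accented.txt"

for f in "$demo_dir/plain.txt" "$demo_dir/accented.txt"; do
    if is_ascii_safe "$f"; then
        echo "$f: converts cleanly"
    else
        echo "$f: contains non-ASCII characters, use -c to drop them"
    fi
done
```

The check discards the converted output, so it only answers whether the conversion would succeed.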
| |
| |
| |
| ====== Generate Unsalted MD5 Password Hash ====== |
| |
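| You can generate an unsalted MD5 hash of a password using ''md5sum'' and ''echo -n'' (the ''-n'' flag prevents a trailing newline from being hashed along with the password): |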
| <code bash> |
| echo -n "mypassword" | md5sum |
| </code> |
| where ''mypassword'' is the password to hash. |
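Since ''echo -n'' is not portable across shells (some implementations print the ''-n'' literally), ''printf'' is a safer way to feed the password without a trailing newline. A minimal sketch, where ''md5_hash'' is a hypothetical helper name:

```shell
# Hypothetical helper: print only the hex digest of the given string.
md5_hash() {
    # printf '%s' emits the argument verbatim, with no trailing newline
    printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

md5_hash "mypassword"
```

The ''cut'' step strips the file name column (''-'') that ''md5sum'' appends when reading from standard input.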
| |
| ====== List Folder Contents with Octal Permissions ====== |
| |
| Using ''awk'': |
| |
| <code bash> |
| ls -l file | awk '{k=0;for(i=0;i<=8;i++)k+=((substr($1,i+2,1)~/[rwx]/) *2^(8-i));if(k)printf("%0o ",k);print}' |
| </code> |
| |
| where ''file'' is a file to query. |
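On systems with GNU ''coreutils'', ''stat'' can print the octal mode directly, which avoids parsing the output of ''ls''. This is an alternative to the ''awk'' approach rather than a replacement for it, and the ''-c'' format option is specific to GNU ''stat'' (BSD variants use a different syntax). The helper name ''octal_mode'' is made up for illustration:

```shell
# Hypothetical helper: print the octal permission bits of a file (GNU stat).
octal_mode() {
    stat -c '%a' "$1"
}

# Example on a scratch file with known permissions.
scratch=$(mktemp)
chmod 640 "$scratch"
echo "$(octal_mode "$scratch") $scratch"
# To list every entry of the current folder: stat -c '%a %n' *
```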
| |
| ====== Find Large Files ====== |
| |
| The following command uses ''du'' to walk the entire filesystem and reports only the folders containing over ''1GB'' of data: |
| <code bash> |
| du -h / | grep '^[0-9.]*G' | sort -rn |
| </code> |
| |
| Similarly, to find folders holding between ''100MB'' and ''1GB'': |
| <code bash> |
| du -h / | grep '^[1-9][0-9][0-9][0-9.]*M' | sort -rn |
| </code> |
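Matching on the human-readable suffix only catches one unit at a time. A sketch of an alternative is to let ''du -k'' report sizes in kilobytes and compare them numerically with ''awk''; the helper name ''filter_min_kb'' is an assumption made for this example:

```shell
# Hypothetical helper: keep "size<TAB>path" lines whose size in KB is at
# least $1, largest first. Reads the output of `du -k` on standard input.
filter_min_kb() {
    awk -v min="$1" '$1 >= min' | sort -rn
}

# Example: folders under the current directory holding 100MB (102400 KB) or more
du -k . 2> /dev/null | filter_min_kb 102400
```

With a numeric threshold, a single command covers folders of any size above the limit, whichever unit ''du -h'' would have printed.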
| |
| ====== Determine if Operating System is 32 or 64 bits ====== |
| |
| The command: |
| <code bash> |
| getconf LONG_BIT |
| </code> |
| will print ''32'' or ''64'' depending on whether the system is 32-bit or 64-bit. |
| |
| ====== Find Last Modified Files ====== |
| |
| This can be accomplished using ''find'': |
| <code bash> |
| find . -mtime -5 |
| </code> |
| |
| which will find the files that were modified within the last 5 days. |
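''-mtime -5'' matches every entry modified less than 5 × 24 hours ago, directories included. A small sketch that restricts the search to regular files; ''recent_files'' is a hypothetical helper name:

```shell
# Hypothetical helper: list regular files under $1 modified within the
# last $2 days.
recent_files() {
    find "$1" -type f -mtime "-$2"
}

# Example on a scratch folder: one fresh file, one back-dated to the year 2000.
demo=$(mktemp -d)
touch "$demo/new.txt"
touch -t 200001010000 "$demo/old.txt"
recent_files "$demo" 5    # lists only new.txt
```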
| |
| ====== Sleeping Efficiently with Sleep Infinity ====== |
| |
| One widespread anti-pattern in programming is code that waits forever in a loop. Such code typically calls a sleep function that suspends the process and wakes it up again after a specified time, while the loop condition is a tautology. |
| |
| Even though seemingly innocuous, the following code pattern, here written in shell, is frequently found in the wild: |
| <code bash> |
| sleep 3600 |
| </code> |
| |
| which, when inspected with ''strace'', reveals the following sequence of system calls: |
| <code> |
| strace sleep 3600 |
| execve("/usr/bin/sleep", ["sleep", "3600"], 0x7ffd82e7c368 /* 20 vars */) = 0 |
| brk(NULL) = 0x5561d92ce000 |
| mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f58db335000 |
| access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) |
| openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 |
| newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=24946, ...}, AT_EMPTY_PATH) = 0 |
| mmap(NULL, 24946, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f58db32e000 |
| close(3) = 0 |
| openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 |
| read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20t\2\0\0\0\0\0"..., 832) = 832 |
| pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 |
| newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0 |
| pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 |
| mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f58db14d000 |
| mmap(0x7f58db173000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7f58db173000 |
| mmap(0x7f58db2c8000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7f58db2c8000 |
| mmap(0x7f58db31b000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7f58db31b000 |
| mmap(0x7f58db321000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f58db321000 |
| close(3) = 0 |
| mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f58db14a000 |
| arch_prctl(ARCH_SET_FS, 0x7f58db14a740) = 0 |
| set_tid_address(0x7f58db14aa10) = 113625 |
| set_robust_list(0x7f58db14aa20, 24) = 0 |
| rseq(0x7f58db14b060, 0x20, 0, 0x53053053) = 0 |
| mprotect(0x7f58db31b000, 16384, PROT_READ) = 0 |
| mprotect(0x5561d17ba000, 4096, PROT_READ) = 0 |
| mprotect(0x7f58db367000, 8192, PROT_READ) = 0 |
| prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 |
| munmap(0x7f58db32e000, 24946) = 0 |
| getrandom("\x13\x40\x4d\xf5\x7b\x21\x69\x65", 8, GRND_NONBLOCK) = 8 |
| brk(NULL) = 0x5561d92ce000 |
| brk(0x5561d92ef000) = 0x5561d92ef000 |
| openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 |
| newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=3048928, ...}, AT_EMPTY_PATH) = 0 |
| mmap(NULL, 3048928, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f58dae00000 |
| close(3) = 0 |
| clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=3600, tv_nsec=0}, |
| |
| </code> |
| |
| As can be observed, ''clock_nanosleep()'' is the last system call in the trace. As per its manual page, the call returns if and only if either: |
| * at least the time specified by ''t'' has elapsed, or |
| * a signal is delivered that invokes a handler or terminates the process. |
| |
| It is little known that POSIX signals include ''SIGSTOP'', which suspends a process until a ''SIGCONT'' signal is delivered. In principle, this method is simpler when the time that the program has to wait does not need to be precise (for example, "about an hour"). Counter-intuitively, this covers a large number of use cases: most programs that pause for an hour do not care that it is exactly an hour; the programmer typically just wants the process to wait for about an hour. |
| |
| With that being said, recent versions of ''coreutils'' accept ''infinity'' as a parameter, such that the following command can be used to sleep forever: |
| <code bash> |
| sleep infinity |
| </code> |
| and it can replace infinite loops whose only purpose is to wait forever (for example, loops around ''sleep'' with a very large value passed as parameter). |
| |
| Interestingly, "under the hood", as car mechanics would say, ''sleep'' branches on the ''infinity'' parameter and blocks in the ''pause()'' system call until a signal arrives, instead of waiting on a timer. Here is the proof, under ''strace'': |
| |
| <code> |
| strace sleep infinity |
| execve("/usr/bin/sleep", ["sleep", "infinity"], 0x7ffcb8ccda58 /* 20 vars */) = 0 |
| brk(NULL) = 0x556b698ee000 |
| mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff1e45e1000 |
| access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) |
| openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 |
| newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=24946, ...}, AT_EMPTY_PATH) = 0 |
| mmap(NULL, 24946, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff1e45da000 |
| close(3) = 0 |
| openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 |
| read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20t\2\0\0\0\0\0"..., 832) = 832 |
| pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 |
| newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0 |
| pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784 |
| mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff1e43f9000 |
| mmap(0x7ff1e441f000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7ff1e441f000 |
| mmap(0x7ff1e4574000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7ff1e4574000 |
| mmap(0x7ff1e45c7000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7ff1e45c7000 |
| mmap(0x7ff1e45cd000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff1e45cd000 |
| close(3) = 0 |
| mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff1e43f6000 |
| arch_prctl(ARCH_SET_FS, 0x7ff1e43f6740) = 0 |
| set_tid_address(0x7ff1e43f6a10) = 118396 |
| set_robust_list(0x7ff1e43f6a20, 24) = 0 |
| rseq(0x7ff1e43f7060, 0x20, 0, 0x53053053) = 0 |
| mprotect(0x7ff1e45c7000, 16384, PROT_READ) = 0 |
| mprotect(0x556b55996000, 4096, PROT_READ) = 0 |
| mprotect(0x7ff1e4613000, 8192, PROT_READ) = 0 |
| prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 |
| munmap(0x7ff1e45da000, 24946) = 0 |
| getrandom("\xd4\xf9\x41\x0e\x23\x80\xab\xf4", 8, GRND_NONBLOCK) = 8 |
| brk(NULL) = 0x556b698ee000 |
| brk(0x556b6990f000) = 0x556b6990f000 |
| openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 |
| newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=3048928, ...}, AT_EMPTY_PATH) = 0 |
| mmap(NULL, 3048928, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff1e4000000 |
| close(3) = 0 |
| pause( |
| |
| </code> |
| |
| As can be observed, instead of ''clock_nanosleep()'', the last system call is ''pause()'' which, as per its manual page: |
| * returns only when a signal was caught and the signal-catching function returned |
| |
| In doing so, the program can now only be awakened when a signal is delivered and does not use ''clock_nanosleep'' to wait for a certain time to elapse. |
| |
| One of the performance gains here is that the process (or thread, since ''pause()'' can be used within threads) does not use ''clock_nanosleep()'' anymore. ''clock_nanosleep()'', like most other sleep functions, is programmed with precision in mind. Timekeeping has a long history: earlier computers did not even have a real-time clock, and programmers typically used the screen refresh rate as a clock source or pinned the clock to some other hardware oscillator (a disk drive shutter would have been an option). "Modern clocks", for example the High Precision Event Timer (H.P.E.T.), are extremely precise, but that precision can come at a performance cost. With that said, the loss of performance is in vain in cases where the precision of the wait time does not matter at all. |
| |
| In terms of standard programming, processes can be suspended (and can suspend themselves) with ''SIGSTOP'', for instance by using the ''kill()'' or ''raise()'' functions, and then awakened with ''SIGCONT''. Within a process spanning multiple threads, a single thread can likewise block itself, for example with ''pause()'', only to be awakened later by a signal. The general idea is that code that is written to spin and wait is much less efficient than event-driven code, which is the design idea behind ''node.js''. In this scenario, the sleep can be interrupted when a specific event takes place, instead of the program having to wake up periodically in order to check whether some work has to be done. |
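The event-driven idea can be sketched in shell by blocking on ''sleep infinity'' and letting a signal handler mark the event. This is an illustration under assumptions: the function name ''wait_for_event'', the variable names and the choice of ''SIGUSR1'' as the event are all made up, and ''sleep infinity'' requires a ''sleep'' implementation that accepts it (such as the one from GNU ''coreutils''):

```shell
# Hypothetical example: block without polling until SIGUSR1 is delivered.
wait_for_event() {
    got_event=0
    trap 'got_event=1' USR1
    sleep infinity &        # child blocks until it is signalled
    sleeper=$!
    # `wait` returns early when a trapped signal reaches the shell
    while [ "$got_event" -eq 0 ]; do
        wait "$sleeper" || true
    done
    kill "$sleeper" 2> /dev/null
    wait "$sleeper" 2> /dev/null || true   # reap the child
    trap - USR1
    echo "event received"
}
```

After calling ''wait_for_event'' in a script, sending ''SIGUSR1'' to the PID of that shell from another terminal (with ''kill -USR1'') wakes the function up; until then the shell simply blocks and does not wake up periodically.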
| |
| |