"... compile a set of commands into a single program and run that."
Linux: see busybox ("multicall binary")
BSD: see binaries on install media ("crunched binary")
sed and tail are both compiled into busybox
re: bsd compiling in tail with crunchgen is easy
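For anyone curious, a crunchgen(1) configuration for such a binary is only a few lines. This is a sketch from memory; source directories and library names vary by BSD and release:

```
srcdirs /usr/src/bin /usr/src/usr.bin
progs sed tail
libs -lutil
```

crunchgen turns that into a Makefile producing one binary that dispatches on argv[0], like busybox.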
As for "objects", the k language can mmap the output of UNIX commands and run parallel computations on those results as "data" not "text". It can be faster than I/O.
I intended for the delimiter to be tab in this example.
Using sed or awk is an option, yes, but I am so used to the standard utilities that I would rather keep using them.
Also I'm not sure if your sed script does what I intended for it to do.
1. Take every line that starts with 2017-05-29.
2. Out of the lines we have, remove any that contain INFO or WARN.
3. Take the last five of those lines.
4. Take the first and third fields of those lines.
Let's create a sample file
cat > /tmp/somefile <<EOF
2017-05-28T08:30+0200 nobody ERR: Foobar failed to xyzzy.
2017-05-29T13:01+0200 nobody INFO: Garply initiated by grault scheduler.
2017-05-29T13:37+0200 nobody DEBUG: Garply exited with 0.
2017-05-29T14:12+0200 nobody WARN: Plugh quux corge.
2017-05-29T14:55+0200 nobody ERR: PLUGH QUUX CORGE!
2017-05-30T00:17+0200 nobody ERR: Failed to retrieve baz needed for thud.
EOF
(Note that due to how HN formatting works, you won't be able to copy-paste this: HN requires a leading space to format text as code, but it then includes those spaces in the output, and we don't want leading spaces in the file.)
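For what it's worth, the four steps map one-to-one onto a pipeline. Re-creating the sample here so the whole thing is copy-pasteable (this assumes the space-delimited sample; swap in a tab delimiter for cut if the real log is tab-separated):

```shell
cat > /tmp/somefile <<'EOF'
2017-05-28T08:30+0200 nobody ERR: Foobar failed to xyzzy.
2017-05-29T13:01+0200 nobody INFO: Garply initiated by grault scheduler.
2017-05-29T13:37+0200 nobody DEBUG: Garply exited with 0.
2017-05-29T14:12+0200 nobody WARN: Plugh quux corge.
2017-05-29T14:55+0200 nobody ERR: PLUGH QUUX CORGE!
2017-05-30T00:17+0200 nobody ERR: Failed to retrieve baz needed for thud.
EOF
# 1: keep 2017-05-29 lines  2: drop INFO/WARN  3: last five  4: fields 1 and 3
grep '^2017-05-29' /tmp/somefile | grep -vE 'INFO|WARN' | tail -n 5 | cut -d' ' -f1,3
```

which prints the timestamp and level fields of the remaining DEBUG and ERR lines.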
But that's just because you assumed that space was the delimiter when it was not.
Your use of the word "INFO" as a placeholder for the first two occurrences of the space character threw me off quite a bit when reading your script, since the word "INFO" occurs in the very file we are working with. It makes sense, though, to use a word that we know can no longer be present, since we have already removed all lines containing it. Still, while a neat trick, those kinds of strange hacks are exactly what has brought me to believe that having objects (like Microsoft PowerShell has) instead of pure text would be beneficial in Unix as well.
---
As for your comments on compiling a set of commands into a single program and running that: I don't think you understood what I meant. Busybox and those crunched binaries you mentioned don't perform the optimization I am talking about, do they?
---
Using the k language you mention in that fashion seems more like a hack and will require a lot of work each time. I would rather rewrite all core commands of my system so that they produced true objects, or actually, rather than objects just structured binary data. I don't need the output to have methods you can call.
One of the main things I want from structured binary data is to be able to select the columns of data by name instead of by index and without having a mess of some commands using tab for delimiter and others space and so on.
So instead of
zfs list -H | cut -f1,3,4
I would like to
zfs list | cut name used avail
for example, or something like that.
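You can approximate that today with a small awk wrapper. "ncut" here is a hypothetical helper name; it assumes the first input line is a whitespace-delimited header row (i.e. `zfs list` without -H):

```shell
# hypothetical helper: select columns by header name instead of index;
# assumes the first line of input is a whitespace-delimited header row
ncut() {
  awk -v cols="$*" '
    NR == 1 {                       # map each header name to a field number
      n = split(cols, want, " ")
      for (i = 1; i <= NF; i++) idx[tolower($i)] = i
      next
    }
    {                               # print the requested fields, tab-separated
      out = ""
      for (i = 1; i <= n; i++) {
        f = idx[want[i]]
        out = out (i > 1 ? "\t" : "") $f
      }
      print out
    }'
}
# e.g. (hypothetical): zfs list | ncut name used avail
```

It drops the header line from the output; a real implementation would presumably keep it, but the point is that the mapping from names to indices is trivial once a header convention exists.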
Also all commands that output tabular data must have a "header" command that will show the column headers. So to see the headers that the 'list' subcommand of the zfs command will output, I would say
zfs header list
And it would tell me
NAME USED AVAIL REFER MOUNTPOINT
And furthermore, since I am authoring the shell, when you press tab to complete a command it will call the binary with the header subcommand and output the headers, so that you have them easily accessible while you're working on a pipeline. And this shall work with pipes already present.
So if I type
zfs list | cut
and then tab to complete, it will call
zfs header list
and pipe that to
cut header
which will return the input it saw
Naturally, "header" will be a reserved word.
All commands will understand how to work with tabular data.
When you use grep you will either specify which column to use, or you can tell it to look across all columns with *
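That column-aware grep can also be approximated with awk today. "colgrep" is a hypothetical name, assuming a header row as before:

```shell
# sketch of column-aware matching: keep the header, then print rows whose
# named column matches the pattern ("colgrep" is a hypothetical helper)
colgrep() {
  awk -v col="$1" -v pat="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if (tolower($i) == col) c = i; print; next }
    $c ~ pat'
}
# e.g. (hypothetical): zfs list | colgrep name tank
```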
The shell will only expand * to file names when the * is positionally last in the argument list of a command, since all commands will take their list of files as the last arguments.
---
All of this being said, I appreciate all replies, including yours.
Ctrl-V then tab. Or if using GNU sed just type \t.
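To illustrate both options (the \t escape is a GNU sed extension, not portable; for cut, a literal tab can be produced with printf instead of typing Ctrl-V Tab):

```shell
# cut with an explicit tab delimiter (tab is also cut's default for -f)
printf 'a\tb\tc\n' | cut -d"$(printf '\t')" -f1,3
# GNU sed understands \t in the expression; other seds need a literal tab
printf 'a b c\n' | sed 's/ /\t/g'
```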
We can reject this simplicity as a "trick" and demand something more complex.
But that suggests that the goal is not to solve the problem, it is to satisfy someone's desire for having some underlying complexity that moves the solution out of the realm of "trivial".
"Mozilla's founding mission was to build the Web by building a browser."
I have been using the Web since 1993.
I always thought the Web had the definition given in its Wikipedia entry.
Web servers and hyperlinks.
Apparently some people believe "the Web" is actually a browser, or a small set of them.
This is like suggesting a galaxy is actually a telescope, or a small set of "standards compliant" telescopes.
Browser-centrism is truly myopic.
What is relevant is the ever expanding practice of installing web servers into all sorts of devices, not simply racks of computers in server rooms and offsite data centers.
"Javascript Required."
"Oh snap! Your browser doesn't support Javascript."
I have seen so many of these Javascript-only "websites" posted on HN that I am wondering: is this coming from some web development template? How difficult is it to have a page with text for those not using Javascript? Something like
<html class=nojs>
<p>This website was designed for browsers that run Javascript. Are you using one? Here are some examples of browsers that work well with our website: browser1, browser2, etc. Alternatively, a no-JS version of the website is available <a href=https://blockstack-site-api.herokuapp.com/v1/blog-rss>here</a>.</p>
</html>
There are of course other ways to do this. The point is that it can be done and is not difficult.
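One of those other ways is the standard noscript element, which browsers render only when scripting is unavailable (the fallback URL below is just a placeholder):

```html
<noscript>
  <p>This site relies on Javascript. A no-JS version is available
     <a href="https://example.com/no-js">here</a>.</p>
</noscript>
```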
There are a bunch of very popular web development frameworks where pretty much all the functionality comes from the javascript. Angular and React are some of the better known ones.
I can sorta kinda understand that you would use something like that for a rich web application, but this is literally a blog post. It doesn't get any more static than that.
One of React's differentiating qualities is that it can render the entire page on the back-end, so there's no reason for the page to fail if the user does not run Javascript.
Whilst this is often true, not all React code will render server side. You need to take care to make sure your React code stays isomorphic, which usually lasts right up to the point you discover some third party React component you reused would require significant rewriting to support server side rendering. That's how my personal projects usually end up anyway!
It's not difficult at all, but is either overlooked or left out on purpose so you can't read the article without them tracking you. For sites that have active developers and cashflow, it's always the latter.
Fact: Certain federal judges were getting a disproportionate amount of patent cases.
Consider: Over time, it is possible judges in other jurisdictions did not like this. (Why?)
Question: Can anyone assume that these other judges not in, e.g., ED Tx, will not also be "plaintiff-friendly"?
Consider: Being "plaintiff-friendly" can have the effect of more patent cases being filed in the judge's jurisdiction. Further consider that some judges may want more patent cases filed in their jurisdiction.
As such, the headline may be prematurely drawing conclusions. Or not. Will patent litigation continue to rise, will it remain steady, or will it begin to fall?
Call it coincidence, or nepotism, or maybe something worse... both of the judges that hear most of the patent cases in ED Tx have sons that represent most of the trolls in court.
I would love to see the Justice Department or FBI investigate the financial relationships between the Judges and their sons. Are they jointly invested in anything that could be used to launder funds and provide kickbacks? Or are these just two fathers that are proud to see their kids get rich?
To my knowledge, the statement "both of the judges that hear most of the patent cases in ED Tx have son's that represent most of the trolls in court" is incorrect as written. The comment may be thinking of former judges, who retired in 2015 and 2011, but even for those judges I am skeptical of the "most of the" portion of the assertion. To be clear, the former judges had sons who practiced in ED Tex (and still do), but I would be surprised if they ever represented the most plaintiffs in a given year (either individually or in combination), though I admit I have not pulled the numbers to check.
For the current judges in the district, I am only aware of one who has a child that is an attorney, and to my knowledge that child does not practice in ED Tex (and does not practice in patent cases at all).
This made me wonder if it would render well in emacs using eww. Surprisingly well rendered, actually. Odd that it doesn't have any pictures. But easy to read.
I read that Windows 10 uses peer-to-peer file sharing with any other Windows hosts it locates on the same network.
This way each Windows computer does not have to connect to Microsoft to download, e.g., the Windows 10 "upgrade". It seems like this could also be used to evade attempts by users to block such downloads by blocking Microsoft IP addresses.
Windows 10 could propagate itself through a network of Windows computers, like a ...
Seriously, how does this work in practice?
Windows 10 does peer-to-peer file sharing automatically without requiring any user interaction?
Yes, this is called "delivery optimization" and it's on by default. By default, Windows Enterprise/Education only pull updates from Microsoft and the local domain, while Windows Home/Pro will also pull updates from other peers on the internet.
You can turn it off, or disable pulling from internet peers, but given the OP, who knows if MS actually respects that setting? I guess we have to roll the dice now.
Yes, but you needed to configure that. This comes enabled by default. And on Windows Home/Pro, it downloads updates from third-parties on the internet too.
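If memory serves, the setting behind this is the Delivery Optimization "download mode" policy. The registry path and values below are from memory, so verify them before relying on this:

```
rem elevated prompt; DODownloadMode: 0 = HTTP only, 1 = LAN peers only,
rem 3 = also pull from Internet peers (reportedly the Home/Pro default)
reg add "HKLM\SOFTWARE\Policies\Microsoft\Windows\DeliveryOptimization" /v DODownloadMode /t REG_DWORD /d 1 /f
```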
Regarding the "right to go offline", I see an increasing trend of non-networking software that inexplicably "requires" the user to have an internet connection.
This trend may have the effect of coercing users to stay connected. (Even if the "requirement" is not truly a requirement but merely a suggestion or recommendation disguised as a directive.)
As such, users with the "right to go offline" may not do so because a company is telling them they must stay connected in order for some (non-networking) software to work.
There are many examples of such software, and at the risk of annoying some people, I will provide some.
But the nature of my question arises from the simple idea that sometimes software can accomplish its purpose without an internet connection, as will be familiar to anyone who used such software before internet connections were inexpensive, "always on", or fast.
This is a broad concept. It applies to all software.
Random example 1: Professional/hobbyist audio recording, editing software
Random example 2: Unnamed operating system setting world records for number of "updates"
"Office suite" software, e.g., word processor, spreadsheet, etc.
Can a user record and edit audio without having an internet connection?
Can a user read, create, edit a document or spreadsheet while being disconnected from the internet?
There are reasons that companies want users to stay connected.
However users are not always given full details on those reasons.
Obviously leaving computers connected poses risks for the user.
Users have to weigh those risks. Should users be entitled to the full details? (Without having to use a program like "Little Snitch".)
The first question to ask is: Can a given program accomplish its purpose without using the network?
If yes, then the next question is: Why does a company "require" a user to have a working internet connection for the use of this software?
Free use of a user's internet connection by a company enables collecting user data and potentially generating revenue from user data, e.g., through advertising.
But should users give away their network bandwidth to companies to use however they see fit?
Even more, should users give away their RAM as if it was an inexpensive, infinite resource?
Software programs that generate revenue routinely increase "minimum RAM requirements" year by year but many times users receive no details on why the increases are needed.
The reasons could be legitimate however they might also be questionable. Without consideration of the undisclosed details, how can users make informed decisions?
The reasons are not legitimate, as evidenced by the very fact such software worked just fine "before internet connections were inexpensive".
The reasons behind software forcing you to be online are quite simple: greater control and money-making potential.
- SaaS model makes a shit ton of money on the Internet; it makes deployment orders of magnitude cheaper (especially as it's cross-platform deployment), but it opens the possibility of (as the name suggests) turning what should be a product into a service - so now you get billed continuously for what you'd rather buy once, and this is ultimately possible because you can't pirate other people's servers.
- Entrepreneurs, seeing success of SaaS model, are trying to shove it everywhere. On the one hand, a lot of software that should stay native is moving into the web. On the other hand, there is this idiotic push of turning hardware into SaaS by connecting it to the cloud.
- If you're not aiming at renting your software away for money, there's at least possibility of money-making by selling data it collects.
And while I think a lot of permanently-connected software is simply designed with malicious intent, there's also developer laziness. Too lazy to learn anything but JavaScript? Let's make an Electron app. Too lazy to learn how to build native software? Let's host everything on our side and make an "embedded browser" mobile app. Too lazy to actually go out and ask users what their problems are? Let's hide "analytics" in the app[0]. And the "dev laziness" argument also explains why our applications still do mostly the same things they did 10 years ago, but max out our current CPUs.
This all should be opposed, but I don't see it happening - the commercial and laziness incentives are all turning everything into cloud-first SaaS solutions.
Remember that it is possible to track users without the use of ads.
A browser written by an organization that profits from ad revenue or collecting user information (hereafter "well-known browser") will load elements, e.g., images, in a web page automatically.
No user interactivity is required. The user need not "click" anything. The user may not even be able to see the element loaded.
Email clients supporting HTML email can do the same thing, loading images automatically, hence supporting a method of tracking.
This is a very old method but still widely used.
What if the user is not using a "well-known browser"? What if those elements will not be loaded automatically? Will these methods of tracking still work?
All methods of tracking, other than IP addresses in access logs, rely on assumptions. Many rely on assumptions about usage of a "well-known browser".
The assumption re: automatically loaded elements, "beacons" or whatever one wants to call them, is that the user is using a "well-known browser" that will load elements automatically.
If the user is not using a "well-known browser", all bets are off?
Another example is the HTTP header "fingerprint". HTTP headers are tied to "well-known browsers". What if suddenly all users decided to only send the same minimal headers? In the way that some server software might try to hide its version (e.g., BIND) imagine that users decided to hide their client software version.
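To make the "minimal headers" idea concrete: a legal HTTP/1.1 request needs only a Host header, so everything else a browser sends (User-Agent, Accept-*, and so on) is optional and contributes to the fingerprint:

```shell
# a bare-bones HTTP/1.1 request; only Host is mandatory, Connection is a
# courtesy, and nothing here identifies the client software
printf 'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n'
```

Pipe that into nc (or similar) and most servers will happily answer it.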
Aside from IP addresses, many methods of web tracking are heavily reliant on assumptions about use of "well-known browsers" and the behavior of those browsers. Could these assumptions ever fail to be true? Can users think for themselves?
The www as a medium for exchanging information or even doing commerce does not necessarily require the use of any particular browser. That "requirement" is only imposed by certain sites on the www, for reasons that may ultimately benefit the site owner more than its users. No such "requirement" is imposed by the www itself.
Thinking of this in terms of "a carrot and a stick", as far as I have seen using the www since 1993 there is only a carrot in the form of a "well-known browser". There is no stick. Users are free to make HTTP requests using any client they choose, including ones that do not expose them to advertising or tracking. Such clients may not require an "adblocker" because they do not request elements automatically.
There used to be and perhaps there still is a never-ending battle between commercial entities over which is the "default browser" in a graphical OS. Certain companies tried to coax users into using certain "well-known browsers". There was even a large antitrust case in the US over this issue.
The implication seemed to be that if not set by default users might otherwise choose some other HTTP client to interact with the www. In those days one company wanted to sell a browser as enterprise software. Today that browser is owned by a "non-profit" organization of salaried employees. Other well-known browsers are owned by "for profit" (subject to taxation) commercial entities with thousands of employees.
Today, these well-known browsers are "free". And yet these browsers are written by salaried employees, not open-source project volunteers. These entities continue to market their "free" browsers aggressively to users.
Did you mean cut -d' ' -f1,3?
Assuming delimiter is a space the above example can be reduced to: