It's the extensible nature of XML that gives it an advantage. You can add custom elements and attributes whilst conforming to the base schema.
Granted, XML isn't the only format where this is possible. You can sort of achieve it with JSON, though XML's namespace system helps deal with name collisions. Adding bank-specific messages wouldn't be possible (or would be difficult) with fixed-column formats, for example, unless they had been specifically designed to be extended.
Banks add their own features to the spec - imagine they want to add a new "Bank only" attribute that differentiates their XML schema and makes it better in some way.
ISO 20022 / XML makes this possible without breaking anything. In the past, payment formats were fixed-width text files - practically impossible to change or extend.
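A minimal sketch of the point about namespaces: a bank can bolt a namespaced attribute onto a message without disturbing consumers that only know the base schema. The element names below only loosely imitate ISO 20022 style, and the "bank" namespace and `priority` attribute are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative only: element names loosely follow ISO 20022 style; the
# "bank" namespace and its priority attribute are a made-up vendor
# extension, not part of any real spec.
doc = """\
<Document xmlns="urn:example:pain.001" xmlns:bank="urn:examplebank:ext">
  <CdtTrfTxInf bank:priority="gold">
    <Amt Ccy="EUR">100.00</Amt>
  </CdtTrfTxInf>
</Document>
"""

ns = {"p": "urn:example:pain.001", "bank": "urn:examplebank:ext"}
root = ET.fromstring(doc)
tx = root.find("p:CdtTrfTxInf", ns)

# A baseline consumer reads only the standard fields and is unaffected
# by the extension attribute...
amt = tx.find("p:Amt", ns)
print(amt.get("Ccy"), amt.text)                  # EUR 100.00

# ...while the extending bank can still pick up its own attribute.
print(tx.get("{urn:examplebank:ext}priority"))   # gold
```

A fixed-width format has no equivalent "ignore what you don't recognize" escape hatch, which is what makes this kind of extension safe.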
Excellent example clearly from a fellow soldier from the trenches!
As somebody who has built several instances of both payment and travel-booking systems, I have seen things in systems that "adhere to published schemas" (often because the schemas were beastly, design-by-committee hellscapes of extensibility) that defy belief.
While there is a strong argument to be made that strict type systems in programming languages like Haskell and Rust make it very difficult to play outside of the rules, this is unfortunately not the case in practice when it comes to APIs - both at present, where you have a JSON Schema or OpenAPI spec if you are lucky, and in the past (XML Schema, SOAP).
I wish that the ability to express a tight, fit-for-purpose schema or API actually resulted in this commonly being done. But look at the average API of any number of card payment gateways, hotels, or airlines, and you enter into a world where each integration with a new provider is a significant engineering undertaking to map the semantics - and the errors, oh the weird and wonderful errors... to the semantics of your system.
I am glad to work in the space-adjacent industry now, where things are ever so slightly better.
(Note the lack of sarcastic emphasis - it really is only _slightly_ better!)
This has me a little dumbfounded - it's either really profound or slightly misguided. What do you mean?
As I read this, you think a custom schema won't affect an implementation, but how do you expect to implement against an external service (an API, for example) without the required, defined schema? That's kind of the definition of a schema in this scenario.
Extending the schema might be another thing. But an implementation can't work without adhering to the provider's defined schema, right?
I have been switching my USB peripherals (dongles mostly) and display ports for a while now within my desktop setup, and have only just become aware that these exist. I'm wondering how this kind of switching works with e.g. Bluetooth.
EDIT: Apparently, based on HN search, many HN users have realized the same within the past few years.
I don't think i've seen one that can handle bluetooth devices, unless maybe you plugged a bluetooth dongle into the switch?
i'd go more with multi-device keyboards and mice, like the logitech mx keys for example. have mine paired to 3 devices currently - easily toggle between them with a switch on the keyboard. Granted it's not as seamless, but sometimes i just want to quickly respond to an email on my personal rig while still watching work queues.
they also have mice/trackballs with the same tech - although the mx master mouse has the switch on the bottom, so not as convenient.. and the mx ergo trackball only toggles between 2 devices.
other options, if screen real estate isn't as important, are things like the genki shadow play. While targeted at gamers, it works great as an HDMI input device. I use mine to work on raspberry pi projects on my main system. it's view only - won't pass control through it, but sometimes that's all you really want.
and on the other end of the spectrum, full DisplayPort KVMs like these from Level 1 Techs:
Thanks for sharing! I actually have an MX Mechanical keyboard in my setup, which uses the Easy-Switch function, but I might need to get multiple dongles. As far as I understand, in a KVM setup it's one port to connect a target device to the switch, which is rather a convenience.
Of course, you might run into the problem that you actually don't want to plug some devices into the switch when switching to another PC (for example some audio equipment)... So in a dream device, there would be on-off switches for ports (or groups of ports) plus a target-PC selector switch. Probably some modular approach would work for making groups of ports switchable.
I disagree with the author of the parent comment in regard to ditching SQL and using Spark instead. I actually first wrote my "SQL advocacy" as a reply to this comment but decided to leave that view for what it is and write my own "rant" against complicating "big" data transformations with Spark or EMR (Hadoop Pig) or vendor-locked Spark instrumentations like AWS Glue.
But I agreed with the parent comment's author about pretty much everything until the third bullet point of the second list. I'd like to hear more of the reasoning behind his SQL hate.
I'll make the case for SQL as a data pipelining language. Firstly, many SQL dialects are multi-platform and provide standardization for transformations. It's a declarative language: it defines what to compute, not how the computation happens.
Where SQL is terrible to write is when one must pivot data. Each column transformation is defined separately (CASE WHENs), and when the cardinality of the pivoted vector is high, this results in quite a verbose declaration. The problem can be mitigated, for example, by generating the SQL programmatically with templating languages such as Jinja2. Rendering is handled nicely on platforms such as Airflow when running the rendered SQL in the cloud (for example on top of a Redshift or Presto cluster, or BigQuery).
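To make the pivot point concrete, here is a minimal sketch of generating the CASE WHEN columns programmatically (plain string formatting stands in for the Jinja2 templating mentioned above, and the table and column names are invented for illustration), run against an in-memory SQLite database:

```python
import sqlite3

# Illustrative table: one row per (region, product) sale.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('north', 'a', 10), ('north', 'b', 5),
        ('south', 'a', 7),  ('south', 'b', 12);
""")

# One CASE WHEN per pivoted value -- generated, not hand-written, so a
# high-cardinality pivot stays manageable.
products = ["a", "b"]
cases = ",\n  ".join(
    f"SUM(CASE WHEN product = '{p}' THEN amount ELSE 0 END) AS product_{p}"
    for p in products
)
query = f"SELECT region,\n  {cases}\nFROM sales GROUP BY region ORDER BY region"

rows = con.execute(query).fetchall()
print(rows)  # [('north', 10.0, 5.0), ('south', 7.0, 12.0)]
```

In a real pipeline the `products` list would itself come from a query or config, and a templating engine would keep the generated SQL readable (and quote values safely).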
For writing complex transformations, UDFs and cascading subqueries are the way to go. Window functions are useful for scanning subsets of column values (useful, for example, in vector transformations [normalization, regularization, etc.]).
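As a hedged sketch of the window-function case: mean-centering values per group, the kind of per-subset scan useful for normalization. SQLite 3.25+ supports window functions; the table and column names are illustrative.

```python
import sqlite3

# Illustrative table: feature values belonging to two groups.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE features (grp TEXT, x REAL);
    INSERT INTO features VALUES ('a', 1), ('a', 3), ('b', 10), ('b', 30);
""")

# PARTITION BY restricts the AVG to each row's own group, so every row
# is centered against its group mean without a self-join.
rows = con.execute("""
    SELECT grp, x, x - AVG(x) OVER (PARTITION BY grp) AS x_centered
    FROM features
    ORDER BY grp, x
""").fetchall()
print(rows)
# [('a', 1.0, -1.0), ('a', 3.0, 1.0), ('b', 10.0, -10.0), ('b', 30.0, 10.0)]
```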
SQL is also a language with a gentle learning curve which makes it easy to learn for less software-engineering-minded people (BI people and analysts of different departments in a decentralized data science organization). It's established itself as a lingua franca for matrix transformations already for decades.
Data processing is usually done in batches at different intervals, as in traditional data science nothing really needs real-time processing of single events. That is where Spark shines. But when handling real-time processing, I would rather make the tradeoff of using SQL and Spark side by side than lose the benefits of SQL that I listed above.
When data transformations - with some object ontology attached to them other than "just maths" - are to be done in real time, then you'd better start thinking about building an application for that (using your favorite programming language).
Even with Spark, around 70% of work is done in SparkSQL.
I love SQL. But it's hard to get other DSs on board who think that 40 lines of Spark is better than a 10-line SQL query.
The only thing that worries me with SQL is having to write UDFs for, say, computing a z-score. But maybe that's just because I have never done it? Do you have any good resources about this?
Don’t worry, I’m having my battles convincing my clients (both business and DS/DEs) that this is a viable paradigm. Here’s a nice-looking z-value recipe by Silota that I just googled up: http://www.silota.com/docs/recipes/sql-z-score.html
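For what it's worth, here is one possible shape of the UDF route, sketched with SQLite (which lacks a built-in STDDEV, so a custom aggregate is registered from Python). The table and values are invented; engines like Redshift or BigQuery ship STDDEV built in, so there the z-score is a single plain query.

```python
import math
import sqlite3

# A custom aggregate: population standard deviation.
class Stddev:
    def __init__(self):
        self.values = []

    def step(self, v):
        if v is not None:
            self.values.append(v)

    def finalize(self):
        n = len(self.values)
        mean = sum(self.values) / n
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / n)

con = sqlite3.connect(":memory:")
con.create_aggregate("stddev", 1, Stddev)
con.executescript("""
    CREATE TABLE scores (x REAL);
    INSERT INTO scores VALUES (2), (4), (4), (4), (5), (5), (7), (9);
""")

# z = (x - mean) / stddev, joining each row against the aggregates.
rows = con.execute("""
    SELECT s.x, (s.x - agg.mu) / agg.sigma AS z
    FROM scores s, (SELECT AVG(x) AS mu, stddev(x) AS sigma FROM scores) agg
    ORDER BY s.x
""").fetchall()
print(rows[0])  # (2.0, -1.5): the mean is 5 and the population stddev is 2
```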
I'd just go and write out the technical architecture: define what the inputs are (the raw data) and what the outputs are (matrices for training, testing, etc.) on different intervals (usually, data scientists want the previous day's data processed into some format, A/B test results and such), and how you are going to instrument those transformations. It's not just SQL but also the DB where that SQL is run, plus orchestration (for example with Apache Airflow), and for concrete ETL tasks (nodes in a processing graph) a combination of open-source modules (usually in Python) and Bash scripts.
It takes time to get experienced in explaining and mapping these things to the domain.