Daily Tech Digest - March 21, 2022

Improve agile and app dev meetings

Sometimes, it’s the choice of tools for hybrid work that can simplify remote collaboration. Sometimes it’s how organizations, teams, and people use them. ... These basic tools help agile teams manage their priorities, requirements, and status to complete sprints and releases. There are also opportunities to improve collaboration with product owners and stakeholders using advanced road maps and sharing views of Jira issues on Confluence pages. Another option is to reduce the complexity in developing applications, dashboards, and data integrations with low-code and no-code tools. These tools can cut the time and collaboration required to prototype, develop, test, and deploy capabilities, and their visual programming modalities often lessen the need to create detailed implementation documents. Rosaria Silipo, PhD, principal data scientist and head of evangelism at KNIME, agrees and says, “Low-code tools are becoming increasingly popular and deliver the freedom to prototype swiftly. They enable the collaboration of simple to complex app dev within an integrated platform, where steps are consumable by technical and non-technical stakeholders.”


Is UX design regressing today?

Internet users are confronted all day long with different sites, each with its own logic, rules, and UX design. Users must stay flexible as they adapt throughout the day to applications that each believe they have achieved the perfect logic for a good user experience. Every company has a website, a page on every major social network, and an application. SaaS products are multiplying, and smartphones are increasingly used for everything. This universal need for a digital presence has made demand for UX designers explode. As UX has become commonplace, non-experts have formed expectations of UX designers: chiefly, the expectation of designing an application that pleases. The core of the problem lies at this level of design: an application is not used in isolation. It is linked to dozens of others and is part of a life where digital is ever-present. If some people feel that UX design is regressing, it is because of the lack of consideration for the ecosystem in which applications evolve. All the rules of UX design can be perfectly applied and still create friction if the logic of use has been thought through only for the application and not for the ecosystem.


Documenting the NFT voyage: A journey into the future

The most crucial task for any NFT project is to focus on innovative design and diversified utilities for its users. Moreover, the first-to-market NFT project will always have the edge over other competing projects to generate value. Unfortunately, while making copies of the original (forks) is easy, it does not always translate into a successful project. For example, the legendary Ethereum-based CryptoPunks from Larva Labs is the inspiration behind PolygonPunks residing on the Polygon blockchain. Although PolygonPunks is very successful, many consider it a ‘derivative collection’ that can compromise buyers’ safety. This is why the NFT marketplace OpenSea delisted PolygonPunks after a request from developers at Larva Labs. The second characteristic of a good NFT project is how strong the community is. A genuinely decentralized project with a well-knit community goes a long way in making it a success. As demonstrated above, the Pudgy Penguins and CryptoPunks communities are robust enough to protect the legacy of the projects. Moreover, interoperable NFTs help forge communities across blockchain networks, making them stronger.


“DevOps is a culture, it's not a job description”

In contrast to traditional software development lines, whereby those in product would define the product, pass it to the developers, who would send it to the testers, who would then assess its quality before sending it out for wider use, the ‘Dev-Centric’ culture at Wix advocates that the developer should remain in the middle of that process; it turns the assembly line into a circle with the engineer sitting comfortably within the confines of all the other departments - the movie star in his own film, in charge of filming and the final edit. “DevOps is a culture, it's not a job description… the DevOps culture, it’s kind of intertwined with continuous delivery. It is the culture of giving the developers the responsibility and ability to deploy their product end to end… DevOps is not a job description and I didn't want the people here in the company to confuse the two. It is a very similar concept of empowering the developers to run things on production.” Mordo, who joined Wix in 2010, has seen its growth from a simple website builder into one of the internet’s biggest players and Israel’s largest companies.


Developer sabotages own npm module prompting open-source supply chain security questions

"Even if the deliberate and dangerous act of maintainer RIAEvangelist will be perceived by some as a legitimate act of protest, how does that reflect on the maintainer’s future reputation and stake in the developer community?," Liran Tal, Snyk's director of developer advocacy, said. "Would this maintainer ever be trusted again to not follow up on future acts in such or even more aggressive actions for any projects they participate in?" "When it comes to this particular issue of trust, I believe the best way for it to be handled is with proper software supply chain hygiene," Brian Fox, CTO of supply chain security firm Sonatype, tells CSO. "When you’re choosing what open-source projects to use, you need to look at the maintainers." Fox recommends exclusively choosing code from projects backed by foundations such as the Apache Foundation, which don't have projects with just one developer or maintainer. With foundations there is some oversight, group reviews and governance that's more likely to catch this type of abuse before it's released to the world.


Never-Mind the Gap: It Isn't Skills We're Short Of, It's Common Sense

Every person working in cybersecurity today started somewhere, and the amount of learning material currently available surpasses what was around when many of us started out. Enticing the right person to one of these outlets can spark a flame that can burn through an organization faster than anything else. When you ignite a passion, you ignite something deeper, and aiding these individuals in manifesting their talent can only benefit your organization. There needs to be a new narrative that cybersecurity is not only about having technical prowess because many roles don’t require a high level of technical expertise. These positions are a great stepping stone into the industry for those who lack the core technological know-how you might expect when you think of a “cybersecurity expert” and provide valuable insights and input to the security teams. Organizations love silos, but what happens when larger strategies overlap silos, technologies and outcomes?


Explore 9 essential elements of network security

Advanced network threat prevention products perform signatureless malware discovery at the network layer to detect cyber threats and attacks that employ advanced malware and persistent remote access. These products employ heuristics, code analysis, statistical analysis, emulation and machine learning to flag and sandbox suspicious files. Sandboxing -- the isolation of a file from the network so it can execute without affecting other resources -- helps identify malware based on its behavior rather than through fingerprinting. ... DDoS mitigation is a set of hardening techniques, processes and tools that enable a network, information system or IT environment to resist or mitigate the effect of DDoS attacks on networks. DDoS mitigation activities typically require analysis of the underlying system, network or environment for known and unknown security vulnerabilities targeted in a DDoS attack. This also requires identification of what normal conditions are -- through traffic analysis -- and the ability to identify incoming traffic to separate human traffic from humanlike bots and hijacked web browsers.
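The "identify what normal conditions are" step amounts to establishing a traffic baseline and flagging deviations from it. Here is a deliberately simple sketch, not tied to any product, using a rolling mean and standard deviation over request counts per interval; the window size and threshold are assumptions.

```python
import statistics
from collections import deque

class TrafficBaseline:
    """Toy baseline: flag intervals whose request count deviates strongly from recent history."""
    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.history = deque(maxlen=window)  # recent requests-per-interval samples
        self.threshold = threshold           # standard deviations considered anomalous

    def observe(self, requests_per_interval: int) -> bool:
        anomalous = False
        if len(self.history) >= 10:  # wait for enough history before judging
            mean = statistics.mean(self.history)
            stdev = statistics.pstdev(self.history) or 1.0
            anomalous = abs(requests_per_interval - mean) > self.threshold * stdev
        self.history.append(requests_per_interval)
        return anomalous

baseline = TrafficBaseline()
for sample in [100, 98, 105, 97, 102, 99, 101, 103, 100, 98, 5000]:
    if baseline.observe(sample):
        print(f"possible DDoS burst: {sample} requests in interval")
```

Real DDoS mitigation products combine many more signals (source reputation, protocol behavior, browser challenges) to separate humans from humanlike bots, but the baseline-and-deviation idea is the common starting point.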


Preparing for the quantum-safe encryption future

Quantum-safe encryption is key to addressing the quantum-based cybersecurity threats of the future, and Woodward predicts that a NIST candidate will eventually emerge as the new standard used to protect virtually all communications flowing over the internet, including browsers using TLS. “Google has already tried experiments with this using a scheme called New Hope in Chrome,” he says. Post-Quantum’s own encryption algorithm, NTS-KEM (now known as Classic McEliece), is the only remaining finalist in the code-based NIST competition. “Many have waited for NIST’s standard to emerge before taking action on quantum encryption, but the reality now is that this could be closer than people think, and the latest indication is that it could be in the next month,” says Cheng. Very soon, companies will need to start upgrading their cryptographic infrastructure to integrate these new algorithms, which could take over a decade, he says. “Microsoft’s Brian LaMacchia, one of the most respected cryptographers in the world, has summarized succinctly that quantum migration will be a much bigger challenge than past Windows updates.”


The value of DevEx: how starting with developers can boost customer experience

The benefits of building a great customer experience are clear, but when identifying how to actually go about curating a world-class customer experience, things become more complicated. Many start by looking at end-user features and technologies such as chatbots, conversational AI, omnichannel messaging, and more as a way to kickstart CX efforts. Yet while all of these can, and should, improve customer experience, they do not address customer experience at its core. The reality is, in order to truly build a transformational customer experience, you must first start with providing a better experience for those who are responsible for building your products, services, and the experiences customers have when interacting with them. You must start with your developers. Developer experience is customer experience. ... Creating a great developer experience means creating a frictionless developer experience. If developers can spend less time figuring out tools, processes, and procedures, they can spend more time innovating and building modern features and experiences for their end-users.


Why machine identities matter (and how to use them)

It is well accepted that reliance on perimeter network security, shared accounts, or static credentials such as passwords, are anti-patterns. Instead of relying on shared accounts, modern human-to-machine access is now performed using human identities via SSO. Instead of relying on network perimeter, a zero-trust approach is preferred. These innovations have not yet made their way into the world of machine-to-machine communication. Machines continue to rely on the static credentials – an equivalent of a password called the API key. Machines often rely on perimeter security as well, with microservices connecting to databases without encryption, authentication, authorization, or audit. There is an emerging consensus that password-based authentication and authorization for humans is woefully inadequate to secure our critical digital infrastructure. As a result, organizations are increasingly implementing “passwordless” solutions for their employees that rely on integration with SSO providers and leverage popular, secure, and widely available hardware-based solutions like Apple Touch ID and Face ID for access.
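One common alternative to static API keys is issuing short-lived, signed credentials to workloads so that possession of a leaked secret is only useful for minutes rather than forever. The sketch below is a generic illustration using the PyJWT library, not any particular vendor's product; the service names, audience, and five-minute lifetime are assumptions for the example.

```python
import datetime
import jwt  # PyJWT

SIGNING_KEY = "demo-secret"  # in practice a KMS/HSM-protected key, never a hard-coded string

def issue_machine_token(service_name: str, audience: str, ttl_seconds: int = 300) -> str:
    """Mint a short-lived credential for a workload instead of a long-lived API key."""
    now = datetime.datetime.now(datetime.timezone.utc)
    claims = {
        "sub": service_name,  # the machine identity
        "aud": audience,      # which service may accept this token
        "iat": now,
        "exp": now + datetime.timedelta(seconds=ttl_seconds),  # forces frequent re-issuance
    }
    return jwt.encode(claims, SIGNING_KEY, algorithm="HS256")

def verify_machine_token(token: str, audience: str) -> dict:
    # Verification enforces signature, expiry, and intended audience.
    return jwt.decode(token, SIGNING_KEY, algorithms=["HS256"], audience=audience)

token = issue_machine_token("billing-service", audience="orders-db")
print(verify_machine_token(token, audience="orders-db")["sub"])
```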



Quote for the day:

"Confident and courageous leaders have no problems pointing out their own weaknesses and ignorance." -- Thom S. Rainer

Daily Tech Digest - March 20, 2022

Can Open Source Sustain Itself without Losing Its Soul?

It’s clear that businesses will need to play more of a role in open source. As Valsorda noted in the same blog post, “open source sustainability and supply chain security are on everyone’s slide decks, blogs, and press releases. Big companies desperately need the open source ecosystem to professionalize.” Amanda Brock, CEO of OpenUK, a not-for-profit that supports the use of open technologies, concurred: “We need to know not only that we have the best software that can be produced — which collaborative and diverse globally produced open source software is — but also that appropriate funding has been provided to ensure that those building all this essential software are able to maintain and support it being secure.” Brock cited a number of examples of where this is happening in the U.K.; for example, she pointed to the work of the Energy Digitalisation Taskforce. That governmental group “suggested that the spine of the digitalized energy sector should be built on open source software. The National Health Service in the U.K. also now has an open source software-first approach for code it creates that it is increasingly trying to live by.”


Using the Business Model Canvas in Enterprise Architecture

The Business Model Canvas brings together nine key elements of a business model, making it possible to observe and describe the relationships of those nine elements to each other. As architects, plotting the relationship of one element to other elements is familiar territory. We align patterns, find gaps, map gives and gets, and understand strategy by assessing the relationships of the critical systems in an architecture landscape. The Business Model Canvas is yet another tool to help us convey understanding. Many enterprise architects hang their hats on “People, Process, Technology,” the PPT framework popularized in the 1990s. The roots of PPT extend further back, to the 1960s and the Diamond Model from Harold Leavitt. PPT and the Diamond Model are useful, for certain, but the canvas offers something that every enterprise architect should value. In the aggregate, the nine blocks tell the story of the organization, how it goes to market and aims to create, deliver, and capture value...


How Enterprise Architecture Helps Reduce IT Costs

This is easier said than done with the traditional process of manual follow-ups, hampered by inconsistent documentation often scattered across many teams. The documentation problem also means that maintenance efforts are duplicated, wasting resources that could have been better deployed elsewhere. The result is the equivalent of around three hours of a dedicated employee’s focus per application per year spent on documentation, governance, and maintenance. Not so for the organization with a digital-native EA platform that leverages its data to enable scalability and automation in workflows and messaging, so you can reach out to the most relevant people in your organization when it is most needed. Features like these can save an immense amount of time otherwise spent identifying the right people to talk to and when to reach out to them, making your enterprise architecture the single source of truth and a solid foundation for effective governance. The result is a reduction of roughly a third in the time usually needed to achieve this.


AI drug algorithms can be flipped to invent bioweapons

Now consider an AI algorithm that can generate deadly biochemicals that behave like VX but are made up of entirely non-regulated compounds. "We didn't do this but it is quite possible for someone to take one of these models and use it as an input to the generative model, and now say 'I want something that is toxic', 'I want something that does not use the current precursors on the watch list'. And it generates something that's in that range. We didn't want to go that extra step. But there's no logical reason why you couldn't do that," Urbina added. If it's not possible to achieve this, you're back to square one. As veteran drug chemist Derek Lowe put it: "I'm not all that worried about new nerve agents ... I'm not sure that anyone needs to deploy a new compound in order to wreak havoc – they can save themselves a lot of trouble by just making Sarin or VX, God help us." There is no strict regulation on the machine-learning-powered synthesis of new chemical molecules. 


A Primer on Proxies

In HTTP/2, each request and response is sent on a different stream. To support this, HTTP/2 defines frames that contain the stream identifier that they are associated with. Requests and responses are composed of HEADERS and DATA frames which contain HTTP header fields and HTTP content, respectively. Frames can be large. When they are sent on the wire they might span multiple TLS records or TCP segments. Side note: the HTTP WG has been working on a new revision of the document that defines HTTP semantics that are common to all HTTP versions. The terms message, header fields, and content all come from this description. HTTP/2 concurrency allows applications to read and write multiple objects at different rates, which can improve HTTP application performance, such as web browsing. HTTP/1.1 traditionally dealt with this concurrency by opening multiple TCP connections in parallel and striping requests across these connections. In contrast, HTTP/2 multiplexes frames belonging to different streams onto the single byte stream provided by one TCP connection. 
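To see HTTP/2 stream multiplexing from an application's point of view, here is a minimal sketch I have added using the httpx library (installed with its HTTP/2 extra); the URLs are placeholders. Several requests are issued concurrently, and the library interleaves their HEADERS and DATA frames as separate streams over one TCP connection.

```python
import asyncio
import httpx

async def main() -> None:
    # A single AsyncClient reuses one HTTP/2 connection; each request gets its own stream.
    async with httpx.AsyncClient(http2=True) as client:
        urls = ["https://example.org/a", "https://example.org/b", "https://example.org/c"]
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for resp in responses:
            # http_version reports "HTTP/2" when the server negotiated it via ALPN.
            print(resp.url, resp.http_version, resp.status_code)

asyncio.run(main())
```

Under HTTP/1.1 the same concurrency would typically require three parallel TCP connections.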


Thinking Strategically Will Help You Get Ahead and Stay Ahead

Create mental space for new ideas to kick in. Without quiet time to sit with your thoughts, facing the uncomfortable silence and letting your mind wander, you cannot draw useful connections. It will not happen the first time around, and probably not even the second time. But if you are persistent in your efforts, without digital and other distractions of daily life, you will start to notice new patterns of thinking. New ideas that you never thought about before will start to surface. Another great strategy is to not restrict yourself to knowledge within your current scope of work. Spend time learning about your business and industry. Meet with other functions within your organization to understand how they operate, what their challenges are, and how they make decisions. All of this knowledge will enable you to apply different mental models and connect ideas from different domains, thereby expanding your circle of competence and building your strategic thinking skills. Remember, building strategic thinking skills involves looking beyond the obvious and the now, to prodding and shaping the uncertain future.


Microsoft Azure reveals a key breakthrough toward scaling quantum computing

“It’s never been done before, and until now it was never certain that it could be done. And now it’s like yes, here’s this ultimate validation that we’re on the right path,” she said. What have researchers achieved? They have developed devices capable of inducing a topological phase of matter bookended by a pair of Majorana zero modes, types of quantum excitations first theorized about in 1937 that don’t normally exist in nature. Majorana zero modes are crucial to protecting quantum information, enabling reliable computation, and producing a unique type of qubit, called a topological qubit, which Microsoft’s quantum machine will use to store and compute information. A quantum computer built with these qubits will likely be more stable than machines built with other types of known qubits and may help solve some of the problems which currently baffle classical computers. “Figuring out how to feed the world or cure it of climate change will require discoveries or optimization of molecules that simply can’t be done by today’s classical computers.”


Lensless Camera Captures Cellular-Level 3D Details

At the sensor, light that comes through the mask appears as a point spread function, a pair of blurry blobs that seems useless but is actually key to acquiring details about objects below the diffraction limit that are too small for many microscopes to see. The blobs’ sizes, shapes, and distances from each other indicate how far the subject is from the focal plane. Software reinterprets the data into an image that can be refocused at will. The researchers first tested the device by capturing cellular structures in a lily of the valley, and then calcium activity in small jellyfish-like hydra. The team then monitored a running rodent, attaching the device to the rodent’s skull and then setting the animal down on a wheel. Data showed fluorescent-tagged neurons in a region of the animal’s brain, connecting activity in the motor cortex with motion and resolving blood vessels as small as 10 µm in diameter. In collaboration with Rebecca Richards-Kortum and research scientist Jennifer Carns from Rice Bioengineering, the team identified vascular imaging as a potential clinical application of the Bio-FlatScope. 


Handling Out-of-Order Data in Real-Time Analytics Applications

The solution is simple and elegant: a mutable cloud native real-time analytics database. Late-arriving events are simply written to the portions of the database they would have been if they had arrived on time in the first place. In the case of Rockset, a real-time analytics database that I helped create, individual fields in a data record can be natively updated, overwritten or deleted. There is no need for expensive and slow copy-on-writes, a la Apache Druid, or kludgy segregated dynamic partitions. A mutable real-time analytics database provides high raw data ingestion speeds, the native ability to update and backfill records with out-of-order data, all without creating additional cost, data error risk or work for developers and data engineers. This supports the mission-critical real-time analytics required by today’s data-driven disruptors. In future blog posts, I’ll describe other must-have features of real-time analytics databases such as bursty data traffic and complex queries.
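The core idea, writing a late event to wherever it would have landed had it arrived on time, can be illustrated independently of any particular database. The following is my toy sketch of upsert semantics keyed on a record ID, not Rockset's actual API.

```python
from typing import Any, Dict

class MutableStore:
    """Toy mutable store: late or corrected events overwrite fields in place, no copy-on-write."""
    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, Any]] = {}

    def upsert(self, record_id: str, fields: Dict[str, Any]) -> None:
        # Whether the event arrives on time or hours late, it lands in the same record.
        self.records.setdefault(record_id, {}).update(fields)

store = MutableStore()
store.upsert("order-42", {"status": "created", "created_at": "2022-03-18T10:00:00Z"})
# A late-arriving correction for the same order simply mutates the existing record.
store.upsert("order-42", {"status": "shipped", "shipped_at": "2022-03-18T14:30:00Z"})
print(store.records["order-42"])
```

In an immutable, segment-based system the second write would instead force a rewrite of the affected partition, which is exactly the cost the article argues against.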


CISOs face 'perfect storm' of ransomware and state-supported cybercrime

With not just ransomware gangs raiding network after network, but nation states consciously turning a blind eye to it, today's chief information security officers are caught in a "perfect storm," says Cybereason CSO Sam Curry. "There's this marriage right now of financially motivated cybercrime that can have a critical infrastructure and economic impact," Curry said during a CISO roundtable hosted by his endpoint security shop. "And there are some nation states that do what we call state-ignored sanctioning," he continued, using Russia-based REvil and Conti ransomware groups as examples of criminal operations that benefit from their home governments looking the other way. "You get the umbrella of sovereignty, and you get the free license to be a privateer in essence," Curry said. "It's not just an economic threat. It's not just a geopolitical threat. It's a perfect storm." It's probably not a huge surprise to anyone that destructive cyberattacks keep CISOs awake at night.



Quote for the day:

"Leadership means forming a team and working toward common objectives that are tied to time, metrics, and resources." -- Russel Honore

Daily Tech Digest - March 19, 2022

How Radical API Design Changed the Way We Access Databases

One of the early design decisions we made at MongoDB was to focus on interaction with the database using a pure object-based API. There would be no query language. Instead, every request to the database would be described as a set of objects that were intended to be constructed by a computer as much as by a human (in many cases, more often by a computer). This approach allowed programmers to treat a complex query the same as creating a piece of imperative code. Want to retrieve all the animals in your database that have exactly two legs? Then create an object, set a member, “legs,” to two and query the database for matching objects. What you get back is an array of objects. This model extends to even the most complex operations. This approach enabled developers to build database queries as code — it was a leap from a query language mindset to a programmer’s mindset. This would significantly speed up development time and improve query performance. This API approach to database operations helped kickstart MongoDB’s rapid adoption and growth in our early years.
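The "two legs" example maps directly onto the driver API. Here is a minimal sketch using PyMongo; the connection string, database, and collection names are placeholders.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
animals = client["zoo"]["animals"]

# The query is just an object (a Python dict), not a string in a query language.
query = {"legs": 2}
for animal in animals.find(query):
    print(animal["name"], animal["legs"])
```

Because the query is an ordinary data structure, it is as easy for a program to construct as for a person, which is the point the author is making.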


Software Techniques for Lemmings

The performance of a system with thousands of threads will be far from satisfying. Threads take time to create and schedule, and their stacks consume a lot of memory unless their sizes are engineered, which won't be the case in a system that spawns them mindlessly. We have a little job to do? Let's fork a thread, call join, and let it do the work. This was popular enough before the advent of <thread> in C++11, but <thread> did nothing to temper it. I don't see <thread> as being useful for anything other than toy systems, though it could be used as a base class to which many other capabilities would then be added. Even apart from these Thread Per Whatever designs, some systems overuse threads because it's their only encapsulation mechanism. They're not very object-oriented and lack anything that resembles an application framework. So each developer creates their own little world by writing a new thread to perform a new function. The main reason for writing a new thread should be to avoid complicating the thread loop of an existing thread. 
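The usual alternative to "thread per whatever" is a bounded pool that queues small jobs onto a fixed set of workers. The article is discussing C++, but the idea is language-agnostic; here it is as a brief Python sketch using the standard library, my illustration rather than the author's code.

```python
from concurrent.futures import ThreadPoolExecutor

def little_job(n: int) -> int:
    # The work we might otherwise have spawned (and joined) a dedicated thread for.
    return n * n

# A fixed-size pool: thread creation, stack sizing, and scheduling costs are paid once,
# no matter how many little jobs are submitted.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(little_job, range(1000)))

print(sum(results))
```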


Software development is changing again. These are the skills companies are looking for

The new normal means developers will work in a variety of ways with a broad church of partners. As well as internal developers, Verastar uses outsourced capability and works closely with some key digital transformation partners, including Salesforce. "We have a very hybrid team. People need to learn to work together and across different teams. We bring everything together with Agile and sprints. Working in a virtual world means it's very rare you're all sat together in the same office now," says Clarkson, "And that's certainly the case with us. Although we've got a centre in Sale, Manchester, we've got developers that work remote, our partner works remotely, and there'll be based either nearshore or offshore as well, so you can end up with quite a wide team." Dal Virdi, IT director at legal firm Shakespeare Martineau, is another tech chief who recognises that a successful modern IT team relies on a hybrid blend of internal developers and external specialists. Virdi recognised about 18 months ago that his firm's ongoing digital transformation strategy, and the way in which the business was introducing a broad range of technologies, meant they didn't need to have internal specialists focused on one language or platform.


Concept drift vs data drift in machine learning

Concept and data drift are a response to statistical changes in the data. Hence, approaches that monitor the model’s statistical properties, its predictions, and their correlation with other factors help identify the drift. But several steps need to be taken after identification to ensure the model stays accurate. Two popular approaches are online machine learning and periodic retraining. Online learning updates the model in real time: the learner consumes the data sequentially, takes in batches of samples as they arrive, and optimises on each batch in one go, so the model retains the new patterns in the data stream. Periodic retraining of the model is also critical. Since an ML model degrades every three months on average, retraining at regular intervals can stop drift in its tracks.
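For the online-learning side of this, scikit-learn's partial_fit interface is one common way to update a model incrementally as new batches arrive. The data below is synthetic and only illustrates the call pattern, not a recommended drift-handling pipeline.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier()
classes = np.array([0, 1])  # must be declared on the first incremental update

for batch in range(5):
    # Each incoming batch nudges the model, so it can track slowly drifting data.
    X = rng.normal(loc=batch * 0.1, scale=1.0, size=(200, 3))  # feature mean drifts over time
    y = (X[:, 0] + X[:, 1] > batch * 0.2).astype(int)
    model.partial_fit(X, y, classes=classes)

print(model.predict(rng.normal(size=(3, 3))))
```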


The rise of zero-touch IT

First, zero-touch IT is a way to free your people from maintenance tasks, and up-level your ops team to be more strategic. You’ve noticed the Great Resignation — IT talent never grew on trees, and now there’s an epic drought. Your team’s time and abilities shouldn’t be wasted on what can be automated. Second, IT serves demanding customers. Corporate users have grown less tolerant about waiting for IT to ride to their rescue, and they have sharper expectations. After all, if they can find and load a CRM app on their phone in one minute, why can’t your technology experts provide them with a new company CRM in, say, 10 minutes? Users onboard, request privileges and carry out operations in different time zones. Automation doesn’t sleep, making it a good fit with asynchronous workforces. Third, zero-touch IT, when properly implemented, reduces mistakes caused by fatigue and overload. One distracted IT staffer can easily grant unauthorized data privileges to an outside contractor, with dire consequences. There are options for zero-touch IT; independently constructed workflows can be automated, but this can produce a spaghetti of disparate procedures that behave differently and produce confusion.


Seeing the Unseen: A New Lens on Visibility at Work

In some sense, “seeing what you want to see” means seeing what you already believe. That’s fine if you seek consensus, but it’s not a good formula for innovative thinking. Pressure is necessary to effect real change. This often involves challenging the status quo and stepping out of what has not been recognized as a fixed perspective. Surrounding yourself with people with similar experiences, beliefs, and perceptions about the world can foreclose on the possibility of thinking differently. On teams, shared assumptions can result in people coming up with the same or similar solutions to a set of challenges. While these solutions may help people like you, they may fail to address the needs of others who are not like you. Take, for example, the failure to optimize early smartphone cameras for darker skin tones, or how facial recognition technologies identify White faces with a higher degree of accuracy compared with those of people of color. Technological biases of this kind ensure that some people are seen, while others remain unseen or perhaps seen in a very unfavorable light.


Moore’s Law: Scientists Just Made a Graphene Transistor Gate the Width of an Atom

To be clear, the work is a proof of concept: The researchers haven’t meaningfully scaled the approach. Fabricating a handful of transistors isn’t the same as manufacturing billions on a chip and flawlessly making billions of those chips for use in laptops and smartphones. Ren also points out that 2D materials, like molybdenum disulfide, are still pricey and manufacturing high-quality stuff at scale is a challenge. New technologies like gate-all-around silicon transistors are more likely to make their way into your laptop or phone in the next few years. Also, it’s worth noting that the upshot of Moore’s Law—that computers will continue to get more powerful and cheaper at an exponential rate—can also be driven by software tweaks or architecture changes, like using the third dimension to stack components on top of one another. Still, the research does explore and better define the outer reaches of miniaturization, perhaps setting a lower bound that may not be broken for years. It also demonstrates a clever way to exploit the most desirable properties of 2D materials in chips.


New explanation emerges for robust superconductivity in three-layer graphene

Graphene is an atomically-thin sheet of carbon atoms arranged in a 2D hexagonal lattice. When two sheets of graphene are placed on top of each other and slightly misaligned, the positions of the atoms form a moiré pattern or “stretched” superlattice that dramatically changes the interactions between their electrons. The degree of misalignment is very important: in 2018, researchers at the Massachusetts Institute of Technology (MIT) discovered that at a “magic” angle of 1.1°, the material switches from being an insulator to a superconductor. The explanation for this behaviour is that, as is the case for conventional superconductors, electrons with opposite spins pair up to form “Cooper pairs” that then move through the material without any resistance below a certain critical transition temperature Tc (in this case, 1.7 K). Three years later, the Harvard experimentalists observed something similar happening in (rhombohedral) trilayer graphene, which they made by stacking three sheets of the material at small twist angles with opposite signs. In their work, the twist angle between the top and middle layer was 1.5° while that between the middle and bottom layer was -1.5°.


Intelligent Diagramming Makes Sense of Cloud Complexities

One major issue IT leaders face is simply knowing what components their cloud environments contain. At any point, a company can have SaaS apps, databases, containers and workloads that sometimes spread across multiple cloud providers, as well as on-premises systems. The first step in mitigating these cloud complexities is to know what you have — in other words, to take inventory. Depending on which cloud providers you use — Amazon Web Services (AWS), Microsoft Azure or Google Cloud Platform (GCP) — you may need a variety of inventory tools. Learn what your provider’s default management console offers. In some cases, you may need to write scripts in order to pull data on every resource type from every corner of your cloud environment (different regions, for instance, might require separate queries). If your environment is complex enough, management consoles and scripts won’t cut it. Automated inventory tools can make it much easier to identify every component of your cloud environment. But there are still opportunities to simplify how that inventory is pulled, viewed and understood.
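For AWS specifically, the "write scripts, with separate queries per region" point looks roughly like the boto3 sketch below. This is my illustration, it covers only EC2 instances, and it assumes credentials and a default region are already configured; a real inventory would iterate over many more resource types and paginate results.

```python
import boto3

def list_ec2_instances_all_regions() -> None:
    # The region list itself requires an API call; each region then needs its own query.
    regions = [r["RegionName"] for r in boto3.client("ec2").describe_regions()["Regions"]]
    for region in regions:
        ec2 = boto3.client("ec2", region_name=region)
        reservations = ec2.describe_instances()["Reservations"]
        instances = [i for res in reservations for i in res["Instances"]]
        if instances:
            print(f"{region}: {len(instances)} instance(s)")
            for inst in instances:
                print("  ", inst["InstanceId"], inst["InstanceType"], inst["State"]["Name"])

list_ec2_instances_all_regions()
```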


MLOps for Enterprise AI

There was a time when building machine learning (ML) models and taking them to production was a challenge. There were challenges with sourcing and storing quality data, unstable models, dependency on IT systems, finding the right talent with a mix of Artificial Intelligence Markup Language (AIML) and IT skills, and much more. However, times have changed. Though some of these issues still exist, there has been an increase in the use of ML models amongst enterprises. Organizations have started to see the benefits of ML models, and they continue their investments to bridge the gap and grow the use of AIML. Nevertheless, the growth of ML models in production leads to new challenges like how to manage and maintain the ML assets and monitor the models. Since 2019, there has been a surge in incorporating machine learning models into organizations, and MLOps has started to emerge as a new trending keyword. However, it’s not just a trend; it’s a necessary element in the complete AIML ecosystem.



Quote for the day:

"It's not about how smart you are--it's about capturing minds." -- Richie Norton

Daily Tech Digest - March 18, 2022

Defining the Possible Approaches to Optimum Metadata Management

Metadata has been the focus of a lot of recent work, both in academia and industry. As more and more electronic data is generated, stored, and managed, metadata generation, storage, and management promise to improve the utilization of that data. Data and metadata are intrinsically linked, hence the concept can be found in any possible application area and can take numerous forms depending on its application context. However, it is found that metadata is often employed in scientific computations just for the initial data selection; at the most, metadata about query results is recovered after the query has been successfully executed and correlated. As a result, throughout the query processing procedure, a vast amount of information that may be useful for analyzing query results is not utilized. Thus, the data needs "refinements". There are two distinct definitions of "refinements". The first is the addition of qualifiers that clarify or enlarge an element's meaning. While such modifications may be useful or even necessary for a particular metadata application, for the sake of interoperability, the values of such elements can be regarded as subtypes of a broader element.


Data Contracts — ensure robustness in your data mesh architecture

In cases where many applications are coupled to each other, a cascading effect sometimes can be seen. Even a small change to a single application can lead to the adjustment of many applications at the same time. Therefore, many architects and software engineers avoid building coupled architectures. Data contracts are positioned to be the solution to this technical problem. A data contract guarantees interface compatibility and includes the terms of service and service level agreement (SLA). The terms of service describe how the data can be used, for example, only for development, testing, or production. The SLA typically also describes the quality of data delivery and interface. It also might include uptime, error rates, and availability, as well as deprecation, a roadmap, and version numbers. Data contracts are in many cases part of a metadata-driven ingestion framework. They’re stored as metadata records, for example, in a centrally managed metastore, and play an important role for data pipeline execution, validation of data types, schemas, interoperability standards, protocol versions, defaulting rules on missing data, and so on. Therefore, data contracts include a lot of technical metadata.
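What such a contract looks like as a metadata record varies by framework. A minimal, hypothetical Python representation (the field names are assumptions, not a standard) might pair the schema with the SLA terms and a validation step used during pipeline execution:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class DataContract:
    """Hypothetical data contract stored as a metadata record."""
    name: str
    version: str
    schema: Dict[str, type]                 # column name -> expected Python type
    terms_of_service: str = "development"   # e.g. development, testing, production
    sla: Dict[str, Any] = field(default_factory=dict)  # uptime, error rates, deprecation date...

    def validate(self, record: Dict[str, Any]) -> None:
        # Interface-compatibility check run as part of data pipeline execution.
        for column, expected_type in self.schema.items():
            if column not in record:
                raise ValueError(f"missing column '{column}' required by {self.name} v{self.version}")
            if not isinstance(record[column], expected_type):
                raise TypeError(f"column '{column}' should be {expected_type.__name__}")

contract = DataContract(
    name="customer_orders",
    version="2.1.0",
    schema={"order_id": str, "amount": float},
    sla={"availability": "99.9%", "deprecation": "2023-01-01"},
)
contract.validate({"order_id": "A-1001", "amount": 42.5})  # passes; a bad record would raise
```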


What’s behind the cloud talent crisis — and how to fix it

The problem is that there aren’t enough experienced, trained engineers necessary to meet that need. And even folks who have been in the thick of cloud technology from the start are finding themselves rushing to stay abreast of the evolution of cloud technology, ensuring that they’re up on the newest skills and the latest changes. Compounding the issue, it’s an employee’s market, where job seekers are spoiled for choice by an endless number of opportunities. Companies are finding themselves in fierce competition, fishing during a drought in a pool that keeps shrinking. “It’s going to require so many more experienced, trained engineers than we currently have,” said Cloudbusting host Jez Ward during the Cloud Trends 2022 thought leadership podcast series at ReInvent. “We’re taking it exceptionally seriously, and we probably have it as our number one risk that we’re managing. As we talk to some of our partner organizations, they see this in the same way.” Cloudbusting podcast hosts Jez Ward and Dave Chapman were joined by Tara Tapper, chief people officer at Cloudreach, and Holly Norman, Cloudreach’s head of AWS marketing, to talk about what’s behind the tech crisis, and how companies can meet this challenge.


Meta AI’s Sparse All-MLP Model Doubles Training Efficiency Compared to Transformers

Transformer architectures have established the state-of-the-art on natural language processing (NLP) and many computer vision tasks, and recent research has shown that All-MLP (multi-layer perceptron) architectures also have strong potential in these areas. However, although newly proposed MLP models such as gMLP (Liu et al., 2021a) can match transformers in language modelling perplexity, they still lag in downstream performance. In the new paper Efficient Language Modeling with Sparse all-MLP, a research team from Meta AI and the State University of New York at Buffalo extends the gMLP model with sparsely activated conditional computation using mixture-of-experts (MoE) techniques. Their resulting sMLP sparsely-activated all-MLP architecture boosts the performance of all-MLPs in large-scale NLP pretraining, achieving training efficiency improvements of up to 2x compared to transformer-based mixture-of-experts (MoE) architectures, transformers, and gMLP.
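To make "sparsely activated conditional computation" concrete, here is a toy top-1 mixture-of-experts layer in PyTorch. This is my simplified illustration of the general MoE idea, not the paper's sMLP architecture, whose routing and load balancing are more involved.

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    """Toy MoE: a gate picks one expert per token, so only a fraction of parameters run per input."""
    def __init__(self, d_model: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)
        expert_idx = scores.argmax(dim=-1)  # top-1 routing decision per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate probability so the routing decision stays trainable.
                out[mask] = expert(x[mask]) * scores[mask, i].unsqueeze(-1)
        return out

layer = Top1MoE(d_model=16, num_experts=4)
print(layer(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Each token touches only one expert's weights, which is why compute per token stays flat even as the total parameter count grows with the number of experts.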


How to build a better CIO-CMO relationship

The CIO should be regularly and actively engaging the CMO for assistance in "telling the story" of new technology investments. For example, they should share how the new HR system not only provided a good ROI and TCO, but made employees' lives easier and better. Technology vendors are well aware of the value of having their technology leaders "tell the story." The deputy CIO of Zoom spends a considerable amount of time evangelizing about the company and its products -- and is highly effective at it. Spotify has a well-regarded series of videos about how its DevOps culture helps it succeed. CIOs at non-technology companies -- or more accurately, at companies that produce products other than hardware, software and cloud services -- would do well to take a page from the technology CIO's playbook. CMOs and their teams can assist CIOs and their teams with developing a campaign to market a new technology implementation. They can ensure the campaign captures the appropriate attention of the desired constituencies, up to and including developing success metrics, so CIOs are able to assess how effective they're being.


Matter smart home standard delayed until fall 2022

The CSA is also allowing more time for the build and verification of a larger than expected number of platforms (OS’s and chipsets), which it hopes will see Matter launch with a healthy slate of compatible Matter devices, apps, and ecosystems. This need arose over the last year based on activity seen on the project’s Github repository. More than 16 platforms, including OS platforms like Linux, Darwin, Android, Tizen, and Zephyr, and chipset platforms from Infineon, Silicon Labs, TI, NXP, Nordic, Espressif Systems and Synaptics will now support Matter. “We had thought there would be four or five platforms, but it’s now more than 16,” says Mindala-Freeman. “The volume at which component and platform providers have gravitated to Matter has been tremendous.” The knock-on effect of these SDK changes is that the CSA needs to give its 50 member companies who are currently developing Matter-capable products another chance to test those devices before they go through the Matter certification process. The CSA also shared details of that initial certification process with The Verge. Following a specification validation event (SVE) this summer 


PyTorch Geometric vs Deep Graph Library

Arguably the most exciting accomplishment of deep learning with graphs so far has been the development of AlphaFold and AlphaFold2 by DeepMind, a project that has made major strides in solving the protein structure prediction problem, a longstanding grand challenge of structural biology. With myriad important applications in drug discovery, social networks, basic biology, and many other areas, a number of open-source libraries have been developed for working with graph neural networks. Many of these are mature enough to use in production or research, so how can you go about choosing which library to use when embarking on a new project? Various factors can contribute to the choice of GNN library for a given project. Not least of all is the compatibility with you and your team’s existing expertise: if you are primarily a PyTorch shop it would make sense to give special consideration to PyTorch Geometric, although you might also be interested in using the Deep Graph Library with PyTorch as the backend (DGL can also use TensorFlow as a backend).
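For a sense of what working in PyTorch Geometric looks like, a minimal two-layer graph convolutional network might be sketched as follows; the node counts, feature sizes, and edges are made up for illustration.

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

class TinyGCN(torch.nn.Module):
    def __init__(self, in_channels: int, hidden: int, num_classes: int):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden)
        self.conv2 = GCNConv(hidden, num_classes)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

# A toy graph: 4 nodes with 3 features each, connected in a small cycle.
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]], dtype=torch.long)
data = Data(x=x, edge_index=edge_index)

model = TinyGCN(in_channels=3, hidden=8, num_classes=2)
print(model(data.x, data.edge_index).shape)  # torch.Size([4, 2]): per-node class scores
```

DGL expresses the same model with a different graph object and message-passing API, so team familiarity with the surrounding framework (here, plain PyTorch modules) is often the deciding factor.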


The No-Code Approach to Deploying Deep Learning Models on Intel® Hardware

Deep learning has two broad phases: training and inference. During training, computers build artificial neural network models by analyzing thousands of inputs—images, sentences, sounds—and guessing at their meaning. A feedback loop tells the machine if the guesses are right or wrong. This process repeats thousands of times, creating a multilayered network of algorithms. Once the network reaches its target accuracy, it can be frozen and exported as a trained model. During deep learning inference, a device compares incoming data with a trained model and infers what the data means. For example, a smart camera compares video frames against a deep learning model for object detection. It then infers that one shape is a cat, another is a dog, a third is a car, and so on. During inference, the device isn’t learning; it’s recognizing and interpreting the data it receives. There are many popular frameworks—like TensorFlow, PyTorch, MXNet, and PaddlePaddle—and a multitude of deep learning topologies and trained models. Each framework and model has its own syntax, layers, and algorithms.


Operational resilience is much more than cyber security

To a Chief Information Officer, for example, an IT department can’t be considered operationally resilient without the accurate, actionable data necessary to keep essential business services running. To a Chief Financial Officer, meanwhile, resilience involves maintaining strong financial reporting systems in order to maintain vigilance over spend and savings. This list could run on and on, but while resilience manifests itself differently to different departments, no aspect of an enterprise organisation exists in a vacuum. True resilience involves understanding connections between different aspects of a business – and the dependencies between the various facets of its infrastructure. To understand the connections and dependencies between business services, customer journeys, business applications, cloud/legacy infrastructure, and so on, large organisations need to invest in tools like configuration management databases (CMDBs). With the visibility and knowledge that a CMDB provides, organisations can strengthen their resilience by understanding and anticipating how disruptions to one part of their infrastructure will impact the rest.


Deploying AI With an Event-Driven Platform

There are a number of crucial features required in an event-driven AI platform to provide real-time access to models for all users. The platform needs to offer self-service analytics to non-developers and citizen data scientists. These users must be able to access all models and any data required for training, context, or lookup. The platform also needs to support as many different tools, technologies, notebooks, and systems as possible because users need to access everything by as many channels and options as possible. Further, almost all end users will require access to other types of data (e.g., customer addresses, sales data, email addresses) to augment the results of these AI model executions. Therefore, the platform should be able to join our model classification data with live streams from different event-oriented data sources, such as Twitter and Weather Feeds. This is why my first choice for building an AI platform is to utilize a streaming platform such as Apache Pulsar. Pulsar is an open-source distributed streaming and publish/subscribe messaging platform that allows your machine learning applications to interact with a multitude of application types.
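As a minimal sketch of the publish/subscribe pattern described here, the following uses the Apache Pulsar Python client; the broker URL, topic, and subscription names are placeholders, and the message body is an invented example of a model prediction.

```python
import pulsar

client = pulsar.Client("pulsar://localhost:6650")  # placeholder broker URL

# Analysts, dashboards, or enrichment jobs subscribe to the prediction stream...
consumer = client.subscribe("model-predictions", subscription_name="enrichment-job")

# ...while a model-serving process publishes classification results to the same topic.
producer = client.create_producer("model-predictions")
producer.send(b'{"image_id": "123", "label": "cat", "score": 0.97}')

msg = consumer.receive()
print(msg.data())
consumer.acknowledge(msg)

client.close()
```

Because the topic decouples producers from consumers, new downstream applications (joins with other event streams, dashboards, alerting) can be added without touching the model-serving code.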



Quote for the day:

"As a leader, you set the tone for your entire team. If you have a positive attitude, your team will achieve much more." -- Colin Powell

Daily Tech Digest - March 17, 2022

10 hard truths of change management

“We do a terrible job of understanding and navigating the emotional journey of change,” says Wanda Wallace, leadership coach and managing partner of Leadership Forum. “This is where leaders need to get smart.” While some people may welcome it, “change is also about loss — loss of my current capability while I learn new ones, loss of who I go to to solve a problem, loss of established ways of doing things,” says Wallace. “Even if someone loves the rationale for the change, they still have to grieve the loss of what was and the loss of the ease of knowing what to do even if it wasn’t efficient.” It also involves fear. “This is usually labelled as ‘resistance,’ but I find many times it is fear of not being able to learn the new skills, not being as valued after the change, not feeling competent, not being at the center of activity the way they were before the change,” says Wallace. She advises IT leaders to name those fears, acknowledge them, and talk about the journey of learning — not just from the C-suite, but at the manager level.


Feature Engineering for Machine Learning (1/3)

During EDA, one of the first steps to undertake should be to check for and remove constant features. But surely the model can discover that on its own? Yes, and no. Consider a Linear Regression model where a non-zero weight has been initialized to a constant feature. This term then serves as a secondary ‘bias’ term and seems harmless enough … but not if that ‘constant’ term was constant only in our training data, and (unbeknownst to us) later takes on a different value in our production/test data. Another thing to be on the lookout for is duplicated features. This may not be blatantly obvious when it comes to categorical data, as it might manifest as different labels names being assigned to the same attribute across different columns, e.g. One feature uses ‘XYZ’ to denote a categorical class that another feature denotes as ‘ABC’, perhaps due to the columns being culled from different databases or departments. pd.factorize() can help identify if two features are synonymous.
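Both checks described here, constant features and duplicated features hiding behind different label names, are straightforward with pandas. A small illustration with made-up data:

```python
import pandas as pd

df = pd.DataFrame({
    "region":   ["XYZ", "ABC", "XYZ", "ABC"],          # the same attribute...
    "area":     ["North", "South", "North", "South"],  # ...under a different label scheme
    "constant": [1, 1, 1, 1],                          # constant in the training data only
    "sales":    [10.0, 12.5, 9.8, 11.1],
})

# 1. Constant features: a single unique value adds no information (and may silently change later).
constant_cols = [c for c in df.columns if df[c].nunique() == 1]
print("constant features:", constant_cols)

# 2. Duplicated features: factorize each column so differently-named labels become comparable codes.
codes = df.apply(lambda col: pd.factorize(col)[0])
duplicated_cols = codes.columns[codes.T.duplicated()].tolist()
print("duplicated features:", duplicated_cols)  # flags 'area' as a synonym of 'region'
```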


OpenAI’s Chief Scientist Claimed AI May Be Conscious — and Kicked Off a Furious Debate

Consciousness is at times mentioned in conversations about AI. Although inseparable from intelligence in the case of humans, it isn’t clear whether that’d be the case for machines. Those who dislike AI anthropomorphization often attack the notion of “machine intelligence.” Consciousness, being even more abstract, usually comes off worse. And rightly so, as consciousness — not unlike intelligence — is a fuzzy concept that lives in the blurred intersection of philosophy and the cognitive sciences. The origins of the modern concept can be traced back to John Locke’s work. He described it as “the perception of what passes in a man’s own mind.” However, it has proved to be an elusive concept. There are multiple models and hypotheses on consciousness that have gotten more or less interest throughout the years but the scientific community hasn’t yet arrived at a consensual definition. For instance, panpsychism — which comes to mind reading Sutskever’s thoughts — is a singular idea that got some traction recently. 


Cryptographic Truth: The Future of Trust-Minimized Computing and Record-Keeping

The focus of this article so far has been on how blockchains combine cryptography and game theory to consistently form honest consensus—the truth—regarding the validity of internal transactions. However, how can events happening outside a blockchain be reliably verified? Enter Chainlink. Chainlink is a decentralized oracle network designed to generate truth about external data and off-chain computation. In this sense, Chainlink generates truth from largely non-deterministic environments. Determinism is a feature of computation where a specific input will always lead to a specific output, i.e., code will execute exactly as written. Decentralized blockchains are said to be deterministic because they employ trust-minimization techniques that remove or lower to a near statistical impossibility any variables that could inhibit internal transaction submission, execution, and verification. The challenge with non-deterministic environments is that the truth can be subjective, difficult to obtain, or expensive to verify. 


Red Hat cloud leader defects to service mesh upstart

When service mesh first came out, Kubernetes was in such a fervor -- it had been three or four years, so people had gone through the high of it, and saw the potential, and then there was a little bit of a lull in the hype when it hadn't really exploded in terms of usage. So when service mesh came out, for certain people, it was just like, 'Oh, cool, here's the new thing.' And it was new, 1.0 sort of stuff. If you fast forward, now, four years from that, Kubernetes is now at the point where it's super stable, it's being released less often. You have a lot more companies who are deploying Kubernetes [that are] starting to build new applications. We saw a lot of companies [during] the pandemic build new applications at a faster rate than they did before. [Solo.io customer] Chick-fil-A is an example -- at their thousands of stores as a franchise, before, most people parked their car, went in the store, then came out. Nowadays, the first interaction everybody has with the store is, 'I go on the app, I place my order, I get my loyalty points.' 


Ceramic’s Web3 Composability Resurrects Web 2.0 Mashups

One of the more interesting composability projects to emerge in Web3 is Ceramic, which calls itself “a decentralized data network that brings unlimited data composability to Web3 applications.” It’s basically a data conduit between dApps (decentralized applications), blockchains, and the various flavors of decentralized storage. The idea is that a dApp developer can use Ceramic to manage “streams” of data, which can then be re-used or re-purposed by other dApps via an open API. Unlike most blockchains, Ceramic is also able to easily scale. A blog post on the Ceramic website explains that “each Ceramic node acts as an individual execution environment for performing computations and validating transactions on streams – there is no global ledger.” Also noteworthy about Ceramic is its use of DIDs (Decentralized Identifiers), a W3C web standard for authentication that I wrote about last year. The DID standard allows Ceramic users to transact with streams using decentralized identities.


Uncovering Trickbot’s use of IoT devices in command-and-control infrastructure

A significant part of its evolution also includes making its attacks and infrastructure more durable against detection, including continuously improving its persistence capabilities, evading researchers and reverse engineering, and finding new ways to maintain the stability of its command-and-control (C2) framework. This continuous evolution has seen Trickbot expand its reach from computers to Internet of Things (IoT) devices such as routers, with the malware updating its C2 infrastructure to utilize MikroTik devices and modules. MikroTik routers are widely used around the world across different industries. By using MikroTik routers as proxy servers for its C2 servers and redirecting the traffic through non-standard ports, Trickbot adds another persistence layer that helps malicious IPs evade detection by standard security systems. The Microsoft Defender for IoT research team has recently discovered the exact method through which MikroTik devices are used in Trickbot’s C2 infrastructure.


Why (and How) You Should Manage JSON with SQL

JSON documents can be large and contain values spread across tables in your relational database. This can make creating and consuming these APIs challenging because you may need to combine data from several tables to form a response. However, when consuming a service API, you have the opposite problem, that is, splitting a large (aka massive) JSON document into appropriate tables. Using custom-written code to map these elements in the application tier is tedious. Such custom code, unless super-carefully constructed by someone who knows how databases work, can also lead to many roundtrips to the database service, slowing the application to a crawl and potentially consuming excess bandwidth. ... The free-form nature of JSON is both its biggest strength and its biggest weakness. Once you start storing JSON documents in your database, it’s easy to lose track of what their structure is. The only way to know the structure of a document is to query its attributes. The JSON Data Guide is a function that solves this problem for you. 
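The kind of structural summary a data guide provides can be approximated in a few lines. The sketch below is my illustration of the idea, not Oracle's JSON Data Guide implementation: it walks a set of documents and records every attribute path together with the types observed there, which exposes drifting or inconsistent structure.

```python
import json
from collections import defaultdict

def summarize_paths(documents, prefix="$"):
    """Collect every attribute path and the JSON types seen at that path."""
    summary = defaultdict(set)

    def walk(value, path):
        summary[path].add(type(value).__name__)
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}")
        elif isinstance(value, list):
            for child in value:
                walk(child, f"{path}[*]")

    for doc in documents:
        walk(doc, prefix)
    return summary

docs = [json.loads(s) for s in (
    '{"id": 1, "customer": {"name": "Ada"}, "items": [{"sku": "A1", "qty": 2}]}',
    '{"id": "2", "customer": {"name": "Grace", "vip": true}}',
)]
for path, types in sorted(summarize_paths(docs).items()):
    print(path, "->", ", ".join(sorted(types)))  # e.g. "$.id -> int, str" reveals inconsistency
```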


The new CEO: Chief Empathy Officer

Top leadership has historically been responsible only for numbers and the bottom line. Profitability and utilization numbers are still important, but they generally do not motivate employees outside of the leadership, shareholders, and the board. Similarly, the feelings and well-being of the staff have long been the primary responsibility of the HR team. This no longer works in a company that is growing sustainably. The “Great Resignation” indicates that well-being has taken on a new level of critical importance. Arguably, a key contributor to this phenomenon has been employees’ lack of emotional connection to their employers. How can leaders help people feel their connection to the organization when they are physically separated? Empathy is the answer. The understanding of the empathetic leader bridges gaps and is a key component in communicating the personal role people have in the strategy of the company. In short, empathy is not just a tactic. Genuine concern for people is the ultimate business strategy for growth.


Four key considerations when moving from legacy to cloud-native

The Cloud Native Computing Foundation (CNCF) defines it as “scalable applications in modern, dynamic environments such as public, private, and hybrid clouds” – characterised by “containers, service meshes, microservices, immutable infrastructure, and declarative APIs.” However, cloud-native computing is more than just running software or infrastructure on the cloud, as cloud-only services still require constant tweaking whenever you deploy applications. With cloud-native technology, however, your applications run on stateless servers and immutable infrastructure that doesn’t require constant modification. According to a 2020 CNCF survey, 51% of respondents stated improved scalability, shorter deployment time, and consistent availability as the top benefits for using cloud-native technology in their projects. Furthermore, Gartner claims more than 45% of IT spending will be reallocated from legacy systems to cloud solutions by 2024.



Quote for the day:

"Leaders are people who believe so passionately that they can seduce other people into sharing their dream." -- Warren G. Bennis,