Oct
19
2020
--

5 Things Developers Should Know Before Deploying MongoDB


MongoDB is one of the most popular databases and one of the easiest NoSQL databases to set up. Developers often want a quick environment to test an idea for an application, or to work out a good data model, without waiting for their Operations team to spin up the infrastructure. Sometimes these quick, one-off instances grow, and before you know it that little test DB is your production DB supporting your new app. For anyone who finds themselves in this situation, I encourage you to check out our Percona blogs, as we have lots of great information for those both new and experienced with MongoDB. Don’t let the ease of installing MongoDB lull you into a false sense of security: there are things you need to consider as a developer before deploying it. Here are five things developers should know before deploying MongoDB in production.

1) Enable Authentication and Authorization

Security is of utmost importance to your database.  While gone are the days when security was disabled by default for MongoDB, it’s still easy to start MongoDB without security.  Without security and with your database bound to a public IP, anyone can connect to your database and steal your data.   By simply adding some important security configuration options to your configuration file, you can ensure that your data is protected.  You can also configure MongoDB to utilize native LDAP or Kerberos for authentication.  Setting up authentication and authorization is one of the simplest ways to ensure that your MongoDB database is secure.  The most important configuration option is turning on authorization which enables users and roles and requires you to authenticate and have the proper roles to access your data.

security:
  authorization: enabled
  keyFile: /path/to/our.keyfile
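
Once authorization is enabled, clients have to authenticate. As a minimal sketch (the user name, password, and host below are made-up placeholders, not values from a real deployment), credentials can be percent-encoded and placed in a standard MongoDB URI, which is then handed to a driver such as PyMongo:

```python
from urllib.parse import quote_plus


def build_authenticated_uri(user, password, host, auth_source="admin"):
    """Return a mongodb:// URI with percent-encoded credentials."""
    # Characters such as '@', ':' and '/' in credentials must be escaped,
    # otherwise the driver cannot tell them apart from URI delimiters.
    return "mongodb://{}:{}@{}/?authSource={}".format(
        quote_plus(user), quote_plus(password), host, auth_source
    )


# Hypothetical credentials, for illustration only.
uri = build_authenticated_uri("app_user", "p@ss:word/1", "hostabc.example.com:27017")
print(uri)
# With PyMongo installed, the URI would be passed to the driver:
#   client = MongoClient(uri)
```

In practice the password would come from a secrets store or environment variable rather than being written into the source.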

 

2) Connect to a Replica Set/Multiple Mongos, Not Individual Nodes

MongoDB’s drivers all support connecting directly to a standalone node, a replica set, or a mongos for sharded clusters. Sometimes your database starts off with one specific node that is always your primary, and it’s easy to set your connection string to connect only to that one node. But what happens when that node goes down? If you don’t have a highly available connection string in your application configuration, then you’re missing out on a key advantage of MongoDB replica sets: the driver connects to the primary no matter which node it is. All of MongoDB’s supported language drivers support the MongoDB URI connection string format and implement the failover logic. Here are some example connection strings for PyMongo, MongoDB’s Python driver: a standalone connection string, a replica set connection string, and an SRV record connection string. If you have the privilege to set up SRV DNS records, they let you standardize your connection string to point to one address without needing to worry about the underlying infrastructure changing.

Standalone Connection String:

from pymongo import MongoClient

client = MongoClient('mongodb://hostabc.example.com:27017/?authSource=admin')

 

Replica Set Connection String:

client = MongoClient('mongodb://hostabc.example.com:27017,hostdef.example.com:27017,hostxyz.example.com/?replicaSet=foo&authSource=admin')

 

SRV Connection String:

client = MongoClient('mongodb+srv://host.example.com/')

Postscript for clusters: If you’re just starting out, you’re usually not setting up a sharded cluster. But if you are running a sharded cluster, then instead of using a replica set connection you connect to a mongos node. To get automatic failover when a mongos node is restarted (or otherwise down), start mongos on multiple hosts and put them, comma-separated, in your connection string’s host list. As with replica sets, you can use SRV records for these too.
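
One way to catch a single-node connection string before it reaches production is to inspect the seed list. The following is only a sketch using the Python standard library; the function name and the "has fallback hosts" heuristic are assumptions for illustration, not part of any MongoDB driver:

```python
from urllib.parse import urlsplit, parse_qs


def describe_connection_string(uri):
    """Summarize the seed list and options of a MongoDB connection string."""
    parts = urlsplit(uri)
    # Drop any "user:password@" prefix, then split the comma-separated seed list.
    seed_list = parts.netloc.rsplit("@", 1)[-1]
    hosts = [h for h in seed_list.split(",") if h]
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    uses_srv = parts.scheme == "mongodb+srv"
    return {
        "hosts": hosts,
        "replica_set": options.get("replicaSet"),
        "uses_srv": uses_srv,
        # Heuristic (an assumption of this sketch): SRV records or more than
        # one seed host give the driver somewhere else to go on failure.
        "has_fallback_hosts": uses_srv or len(hosts) > 1,
    }


standalone = describe_connection_string(
    "mongodb://hostabc.example.com:27017/?authSource=admin"
)
replica_set = describe_connection_string(
    "mongodb://hostabc.example.com:27017,hostdef.example.com:27017,"
    "hostxyz.example.com/?replicaSet=foo&authSource=admin"
)
print(standalone["has_fallback_hosts"], replica_set["has_fallback_hosts"])
```

A check like this could run in a deployment pipeline to flag configurations that still point at one node.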

3) Sharding Can Help Performance But Isn’t Always a Silver Bullet

Sharding is how MongoDB handles the partitioning of data. It is used to distribute load across more replica sets for a variety of reasons, such as write performance, low-latency geographical writes, and archiving data to shards that use slower and cheaper storage. These sharding approaches also help keep your working set in memory, because they lower the amount of data each shard has to deal with.

As previously mentioned, sharding can also be used to reduce latency by separating your shards by geographic region. A common example is having a US-based shard, an EU-based shard, and a shard in Asia, where the data is kept local to its origin. Although it is not the only application for shard zoning, “geo-sharding” like this is a common one. This approach can also help applications comply with the various data regulations that are becoming more important and more strict throughout the world.

While sharding can oftentimes help write performance, that sometimes comes at the detriment of read performance.  An easy example of a poor read performance would be if we needed to run a query to find all of the orders regardless of their origin. This find query would need to be sent to the US shard, the EU shard, and the shard in Asia, with all the network latency that comes with reading from the non-local regions, and then it would need to sort all the returned records on the mongos query router before returning them to the client. This kind of give and take should help you determine what approach you take to choosing a shard key and weighing its impact on your typical query patterns.
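
The difference between a targeted query and a scatter-gather query can be illustrated with a toy model. This is not MongoDB's routing code; the zone names and the choice of `region` as the shard key are assumptions made only for this sketch:

```python
# Toy model of zone-based routing; the zone names and the "region" shard key
# are assumptions for illustration, not MongoDB's actual routing code.
ZONES = {"US": "shard-us", "EU": "shard-eu", "ASIA": "shard-asia"}


def shards_for_query(query, shard_key="region"):
    """Return the shards a query filter would have to be sent to."""
    if shard_key in query:
        # Targeted query: the shard key value identifies a single shard.
        return [ZONES[query[shard_key]]]
    # No shard key in the filter: scatter-gather across every shard.
    return sorted(ZONES.values())


print(shards_for_query({"region": "EU", "status": "shipped"}))
print(shards_for_query({"status": "shipped"}))
```

The first filter includes the shard key and touches one shard, while the second has to visit all three, which is the "all orders regardless of origin" case described above.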

4) Replication ≠ Backups

MongoDB replication, while powerful and easy to set up, is not a substitute for a good backup strategy. Some might think that their replica set members in a DR data center will be sufficient to keep them up in a data loss scenario. While a replica set member in a DR center will surely help you in a DR situation, it will not help you if you accidentally drop a database or a collection in production, as that delete will quickly be replicated to your secondary in the DR data center.

Another common misconception is that delayed replica set members keep you safe. Delayed members still rely on you finding the issue you want to recover from before it gets applied to the delayed member. Are your processes so rock-solid that you can guarantee you’ll find the issue before it reaches your delayed member?

Backups are just as important with MongoDB as they are with any other database. Tools such as mongodump, mongoexport, Percona Backup for MongoDB, and Ops Manager (Enterprise Edition only) between them support point-in-time recovery, oplog backups, hot backups, and full and incremental backups. Backups can be run from any node in your replica set, but the best practice is to run them from a secondary node so you don’t put unnecessary pressure on your primary. In addition to the above methods, you can also take snapshots of your data. This is possible as long as you pause writes to the node you’re snapshotting by freezing the file system, which ensures a consistent snapshot of your MongoDB database.
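
As a rough sketch of the "back up from a secondary" advice, the snippet below builds a mongodump command line without running it. The hosts and output directory are hypothetical, and authentication options are left out for brevity:

```python
import shlex


def mongodump_command(uri, out_dir):
    """Build (but do not run) a mongodump invocation for a replica set."""
    return [
        "mongodump",
        "--uri", uri,
        "--readPreference=secondary",  # read from a secondary, not the primary
        "--oplog",                     # capture oplog entries written during the dump
        "--gzip",                      # compress the output files
        "--out", out_dir,
    ]


cmd = mongodump_command(
    "mongodb://hostabc.example.com:27017,hostdef.example.com:27017/?replicaSet=foo",
    "/backups/2020-10-19",  # hypothetical destination directory
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the argument list separately from executing it makes the command easy to log and review before it is scheduled.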

5) Schemaless is a Myth, Schemas Still Matter

MongoDB was originally touted as a schemaless database, which was attractive to developers who had long struggled to update and maintain their schemas in relational databases. But those schemas succeeded for good reasons in the early days of databases, and while MongoDB gave you the flexibility to skip setting up a schema and create it on the fly, this often led to poorly performing schema designs and anti-patterns. There are lots of stories out in the wild of users not enforcing any structure on their MongoDB data models and running into performance problems as their schema became unwieldy. Today, MongoDB supports JSON Schema and schema validation. These approaches allow you to apply as much or as little structure to your schemas as is needed, so you keep the flexibility of MongoDB’s looser schema structure while still enforcing schema rules that will keep your application performing well and your data model consistent.
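
To make schema validation concrete, here is a minimal sketch of a `$jsonSchema` validator written as a Python dictionary. The `users` collection, its field names, and the allowed phone types are hypothetical choices for this example:

```python
# Hypothetical "users" collection: "name" is required, "phone" is optional
# but must be an array of {type, number} documents when present.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name"],
        "properties": {
            "name": {"bsonType": "string"},
            "phone": {
                "bsonType": "array",
                "items": {
                    "bsonType": "object",
                    "required": ["type", "number"],
                    "properties": {
                        "type": {"enum": ["mobile", "work", "home"]},
                        "number": {"bsonType": "string"},
                    },
                },
            },
        },
    }
}

# With PyMongo and a running server, the validator would be attached when
# the collection is created:
#   db.create_collection("users", validator=user_validator)
print(sorted(user_validator["$jsonSchema"]["properties"]))
```

Fields not listed under `properties` are still accepted, which is how a validator like this keeps some of the flexibility described above while enforcing the parts that matter.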

Another area affected by poor schema design in MongoDB is its aggregation framework. The aggregation framework supports more analytical query patterns such as sorting and grouping, along with useful operations such as unwinding arrays, joins, and a whole lot more. Without a good schema, these sorts of queries can perform very poorly.

MongoDB was also popular due to its lack of support for joins. Joins can be expensive, and avoiding them allowed MongoDB to run quite fast. Though MongoDB has since added $lookup to support left outer joins, embedding documents remains the typical alternative to joins. This approach comes with its pros and cons. As with relational databases, embedding documents is essentially creating a One-to-N relationship; this is covered in greater detail in this blog. In MongoDB, the value of N matters: if it’s one-to-few (2-10) or one-to-many (10-1000), embedding can still be a good schema design as long as your indexes support your queries. When you get to one-to-tons (10000+), you need to consider things like MongoDB’s 16 MB limit per document, or using references to the parent document instead.

Examples of each of these approaches:

One-to-Few, consider having multiple phone numbers for a user:

{  "_id" : ObjectId("1234567890"),
  "name" :  "John Doe",
  "phone" : [     
     { "type" : "mobile", "number" : "+1-585-555-5555" }, 
     { "type" : "work", "number" : "+1-585-555-1111"}  
            ]
}

One-to-Many, consider a parts list for a product with multiple items:

{ "_id" : ObjectId("123"),
  "Item" : "Widget",
  "Price" : 100
}

{  "_id" : ObjectId("0123456789"),
   "manufacturer" : "Percona",
   "catalog_number" : 123456,
   "parts" : [
      { "item" : ObjectId("123") },
      { "item" : ObjectId("456") },
      { "item" : ObjectId("789") },
       ...
             ]
}
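
With references like these, the application (or a $lookup stage) has to follow each reference to fetch the part documents. The sketch below shows that lookup step with plain Python dictionaries; the second part and its price are invented for the example, and strings stand in for ObjectId values:

```python
# Plain strings stand in for ObjectId values so the sketch runs without a driver.
parts_collection = {
    "123": {"_id": "123", "Item": "Widget", "Price": 100},
    "456": {"_id": "456", "Item": "Gadget", "Price": 250},  # hypothetical part
}
product = {
    "manufacturer": "Percona",
    "catalog_number": 123456,
    "parts": [{"item": "123"}, {"item": "456"}],
}


def resolve_parts(product, parts_by_id):
    """Follow each reference in product["parts"] to its part document."""
    return [parts_by_id[ref["item"]] for ref in product["parts"]]


total = sum(part["Price"] for part in resolve_parts(product, parts_collection))
print(total)
```

Against a real database each lookup would be a second query or part of a single `$in` query, which is the cost that embedding avoids.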

One-to-Tons, consider a social network type application:

{  "_id" : ObjectId("123"),
   "username" : "Jane Doe" 
}

{  "_id" : ObjectId("456"),
   "username" : "Eve DBA"
 }

{  "_id" : ObjectId("9876543210"),
   "username" : "Percona",
   "followers" : [     
                    ObjectId("123"),
                    ObjectId("456"),
                    ObjectId("789"),
                    ...  
                 ]
}
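
To get a feel for when an embedded array like `followers` runs into the 16 MB document limit, the following back-of-the-envelope sketch estimates the BSON size of an array of ObjectId values. It ignores every other field in the document, so treat the result as optimistic:

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's 16 MB document limit


def objectid_array_bytes(count):
    """Estimate the BSON size of an array holding `count` ObjectId values."""
    # Each element: 1 type byte + the index as a NUL-terminated string key
    # + a 12-byte ObjectId. The array itself adds a 4-byte length and a
    # trailing NUL byte.
    elements = sum(1 + len(str(i)) + 1 + 12 for i in range(count))
    return 4 + elements + 1


def fits_in_one_document(follower_count):
    # Ignores the other fields in the document, so this is optimistic.
    return objectid_array_bytes(follower_count) < MAX_BSON_DOCUMENT_BYTES


print(fits_in_one_document(100_000), fits_in_one_document(1_000_000))
```

Even when an array fits, rewriting a multi-megabyte document on every new follower is expensive, which is another reason to store references in separate documents at this scale.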

 

Bonus Topic: Transactions

MongoDB has supported multi-document transactions since MongoDB 4.0 (replica sets) and MongoDB 4.2 (sharded clusters). Transactions in MongoDB work much as they do in relational databases: either all actions in the transaction succeed or they all fail. Here’s an example of a transaction in MongoDB:

rs1:PRIMARY> var session = db.getMongo().startSession()
rs1:PRIMARY> session.startTransaction()
rs1:PRIMARY> session.getDatabase("percona").test.insert({today : new Date()})
WriteResult({ "nInserted" : 1 })
rs1:PRIMARY> session.getDatabase("percona").test.insert({some_value : "abc"})
WriteResult({ "nInserted" : 1 }) 
rs1:PRIMARY> session.commitTransaction()

Transactions can be quite powerful if they are truly needed for your application, but be aware of the performance implications, as none of the operations in a transaction are finalized until the whole transaction commits or aborts.
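
The same two inserts could look roughly like this from PyMongo. This is a sketch only: the function name is made up, and `client` is assumed to be a `pymongo.MongoClient` already connected to a replica set:

```python
from datetime import datetime, timezone


def insert_in_transaction(client):
    """Insert two documents atomically, mirroring the shell session above.

    `client` is expected to be a pymongo.MongoClient connected to a replica set.
    """
    collection = client["percona"]["test"]
    with client.start_session() as session:
        # The transaction commits when the block exits normally and aborts
        # if an exception is raised inside it.
        with session.start_transaction():
            collection.insert_one({"today": datetime.now(timezone.utc)}, session=session)
            collection.insert_one({"some_value": "abc"}, session=session)
```

Passing `session=session` to each operation is what ties it to the transaction; an insert without it would run outside the transaction.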

Takeaways:

While MongoDB is easy to get started with and has a low barrier to entry, just like any other database there are some key things that you, as a developer, should consider before deploying it. We’ve covered enabling authentication and authorization to ensure you have a secure application and don’t leak data. We’ve highlighted using highly available connection strings, whether to your replica set, a mongos node list, or via SRV records, to ensure you’re always connecting to the appropriate nodes. We’ve looked at the balancing act of selecting a shard key: consider the impact on both reads and writes and understand the tradeoffs you are making. We’ve covered the importance of backups and of not relying on replication as a backup method. Finally, we covered the fact that schemas still matter with MongoDB, but you still have flexibility in defining how rigid they are. We hope this gives you a better idea of what to consider when deploying MongoDB for your next application. Thanks for reading!

Oct
15
2020
--

Application security platform NeuraLegion raises $4.7 million seed led by DNX Ventures

A video call group photo of NeuraLegion's team working remotely around the world


Application security platform NeuraLegion announced today it has raised a $4.7 million seed round led by DNX Ventures, an enterprise-focused investment firm. The funding included participation from Fusion Fund, J-Ventures and Incubate Fund. The startup also announced the launch of a new self-serve, community version that allows developers to sign up on their own for the platform and start performing scans within a few minutes.

Based in Tel Aviv, Israel, NeuraLegion also has offices in San Francisco, London and Mostar, Bosnia. It currently offers NexDAST for dynamic application security testing, and NexPLOIT to integrate application security into SDLC (software development life cycle). It was launched last year by a founding team that includes chief executive Shoham Cohen, chief technology officer Bar Hofesh, chief scientist Art Linkov and president and chief commercial officer Gadi Bashvitz.

When asked who NeuraLegion views as its closest competitors, Bashvitz named Invicti Security and WhiteHat Security. Both are known primarily for their static application security testing (SAST) solutions, which Bashvitz said complement DAST products like NeuraLegion’s.

“These are complementary solutions and in fact we have some information partnerships with some of these companies,” he said.

Where NeuraLegion differentiates from other application security solutions, however, is that it was created specifically for developers, quality assurance and DevOps workers, so even though it can also be used by security professionals, it allows scans to be run much earlier in the development process than usual while lowering costs.

Bashvitz added that NeuraLegion is now used by thousands of developers through their organizations, but it is releasing its self-serve, community product to make its solutions more accessible to developers, who can sign up on their own, run their first scans and get results within 15 minutes.

In a statement about the funding, DNX Ventures managing partner Hiro Rio Maeda said, “The DAST market has been long stalled without any innovative approaches. NeuraLegion’s next-generation platform introduces a new way of conducting robust testing in today’s modern CI/CD environment.”

Aug
18
2020
--

Melbourne-based CI/CD platform Buildkite gets $28 million AUD Series A led by OpenView

Buildkite’s founding team — Lachlan Donald, Keith Pitt and Tim Lucas — working remotely

Buildkite, a Melbourne-based company that provides a hybrid continuous integration and continuous delivery (CI/CD) platform for software developers, announced today that it has raised AUD $28 million (about USD $20.2 million) in Series A funding, bringing its valuation to more than AUD $200 million (about USD $145 million).

The funding was led by OpenView, an investment firm that focuses on growth-stage enterprise software companies, with participation from General Catalyst.

This round is the company’s first since Buildkite raised about AUD $200,000 in seed funding when it was founded in 2013.

Co-founder and chief executive officer Lachlan Donald told TechCrunch that Buildkite didn’t seek more funding earlier because it was growing profitably. In fact, the company turned away interested investors “because we wanted to focus on sustainable growth and maintain control of our destiny.”

But Donald said they were open to investment from OpenView and General Catalyst because they see the two investors as “true partners as we enter and define this next generation of CI/CD.”

Buildkite’s team is small, with just 26 employees. “We’re a lean, focused team, so their expert advice and guidance will help more software teams around the world discover Buildkite,” Donald said. He added that part of the funding round will be used to give 42X returns to early investors and shareholders, and the rest will be used on product development.

In a statement about the funding, OpenView partner Mackey Craven said, “The global pandemic and the resulting economic uncertainty underlines the importance for companies to maximize efficiencies and build for growth. As the world continues to build digital-first applications, we believe Buildkite’s unique approach will be the new enterprise standard of CI/CD and we’re excited to be supporting them in realizing this ambition.”

Continuous integration gives software teams an automated way to develop and test applications, making collaboration more efficient, while continuous delivery refers to the process of pushing code to environments for further testing by other teams, or deploying it to customers. CI/CD platforms make it easier for fast-growing tech companies to test and deliver software. Buildkite says it now has more than 1,000 customers, including Shopify, Pinterest and Wayfair.

As part of the round, Jean-Michel Lemieux, Shopify’s chief technology officer, and Ashley Smith, chief revenue officer at Gatsby and OpenView venture partner, will join Buildkite’s board.

The increased use of online applications caused by the COVID-19 pandemic means there is more demand for CI/CD platforms, since engineering teams need to work more quickly.

“A good example is Shopify, one of our longstanding partners. They came to us after they outgrew their previous hosted CI provider,” Donald said. “Their challenge is one we see across all of customers — they needed to reduce build time and scale their team across multiple time zones. Once they wrapped Buildkite into their development flow, they saw a 75% reduction in build wait times. They grew their team by 300% and have still been able to keep build time under 10 minutes.”

Other CI platforms available include Jenkins, CircleCI, Travis, Codeship and GitLab. Co-founder and chief technology officer Keith Pitt said one of the ways that Buildkite differentiates from its rivals is its focus on security, which prompted his interest in building the platform in the first place.

“Back in 2013, my then-employer asked that I stop using a cloud-based CI/CD platform due to security concerns, but I found the self-hosted alternatives to be incredibly outdated,” Pitt said. “I realized a hybrid approach was the solution for testing and deploying software at scale without compromising security or performance, but was surprised to find a hybrid CI/CD tool didn’t exist yet. I decided to create it myself, and Buildkite was born.”

Jun
24
2020
--

Why AWS built a no-code tool

AWS today launched Amazon Honeycode, a no-code environment built around a spreadsheet-like interface that is a bit of a detour for Amazon’s cloud service. Typically, after all, AWS is all about giving developers all of the tools to build their applications — but they then have to put all of the pieces together. Honeycode, on the other hand, is meant to appeal to non-coders who want to build basic line-of-business applications. If you know how to work a spreadsheet and want to turn that into an app, Honeycode is all you need.

To understand AWS’s motivation behind the service, I talked to AWS VP Larry Augustin and Meera Vaidyanathan, a general manager at AWS.

“For us, it was about extending the power of AWS to more and more users across our customers,” explained Augustin. “We consistently hear from customers that there are problems they want to solve, they would love to have their IT teams or other teams — even outsourced help — build applications to solve some of those problems. But there’s just more demand for some kind of custom application than there are available developers to solve it.”


In that respect then, the motivation behind Honeycode isn’t all that different from what Microsoft is doing with its PowerApps low-code tool. That, too, after all, opens up the Azure platform to users who aren’t necessarily full-time developers. AWS is taking a slightly different approach here, though, by emphasizing the no-code part of Honeycode.

“Our goal with Honeycode was to enable the people in the line of business, the business analysts, project managers, program managers who are right there in the midst, to easily create a custom application that can solve some of the problems for them without the need to write any code,” said Augustin. “And that was a key piece. There’s no coding required. And we chose to do that by giving them a spreadsheet-like interface that we felt many people would be familiar with as a good starting point.”

A lot of low-code/no-code tools also allow developers to then “escape the code,” as Augustin called it, but that’s not the intent here and there’s no real mechanism for exporting code from Honeycode and taking it elsewhere, for example. “One of the tenets we thought about as we were building Honeycode was, gee, if there are things that people want to do and we would want to answer that by letting them escape the code, we kept coming back and trying to answer the question, ‘Well, okay, how can we enable that without forcing them to escape the code?’ So we really tried to force ourselves into the mindset of wanting to give people a great deal of power without escaping to code,” he noted.


There are, however, APIs that would allow experienced developers to pull in data from elsewhere. Augustin and Vaidyanathan expect that companies may do this for their users on the platform or that AWS partners may create these integrations, too.

Even with these limitations, though, the team argues that you can build some pretty complex applications.

“We’ve been talking to lots of people internally at Amazon who have been building different apps and even within our team and I can honestly say that we haven’t yet come across something that is impossible,” Vaidyanathan said. “I think the level of complexity really depends on how expert of a builder you are. You can get very complicated with the expressions [in the spreadsheet] that you write to display data in a specific way in the app. And I’ve seen people write — and I’m not making this up — 30-line expressions that are just nested and nested and nested. So I really think that it depends on the skills of the builder and I’ve also noticed that once people start building on Honeycode — myself included — I start with something simple and then I get ambitious and I want to add this layer to it — and I want to do this. That’s really how I’ve seen the journey of builders progress. You start with something that’s maybe just one table and a couple of screens, and very quickly, before you know, it’s a far more robust app that continues to evolve with your needs.”

Another feature that sets Honeycode apart is that a spreadsheet sits at the center of its user interface. In that respect, the service may seem a bit like Airtable, but I don’t think that comparison holds up, given that both then take these spreadsheets in very different directions. I’ve also seen it compared to Retool, which may be a better comparison, but Retool is going after a more advanced developer and doesn’t hide the code. There is a reason, though, why these services were built around spreadsheets: everybody is familiar with how to use them.

“People have been using spreadsheets for decades,” noted Augustin. “They’re very familiar. And you can write some very complicated, deep, very powerful expressions and build some very powerful spreadsheets. You can do the same with Honeycode. We felt people were familiar enough with that metaphor that we could give them that full power along with the ability to turn that into an app.”

The team itself used the service to manage the launch of Honeycode, Vaidyanathan stressed, and to vote on the name for the product (though Vaidyanathan and Augustin wouldn’t say which other names they considered).

“I think we have really, in some ways, a revolutionary product in terms of bringing the power of AWS and putting it in the hands of people who are not coders,” said Augustin.

Nov
21
2019
--

Linear takes $4.2M led by Sequoia to build a better bug tracker and more

Software will eat the world, as the saying goes, but in doing so, some developers are likely to get a little indigestion. That is to say, building products requires working with disparate and distributed teams, and while developers may have an ever-growing array of algorithms, APIs and technology at their disposal to do this, ironically the platforms to track it all haven’t evolved with the times. Now three developers have taken their own experience of that disconnect to create a new kind of platform, Linear, which they believe addresses the needs of software developers better by being faster and more intuitive. It’s bug tracking you actually want to use.

Today, Linear is announcing a seed round of $4.2 million led by Sequoia, with participation also from Index Ventures and a number of investors, startup founders and others who will also advise Linear as it grows. They include Dylan Field (Founder and CEO, Figma), Emily Choi (COO, Coinbase), Charlie Cheever (Co-Founder of Expo & Quora), Gustaf Alströmer (Partner, Y Combinator), Tikhon Bernstam (Co-Founder, Parse), Larry Gadea (CEO, Envoy), Jude Gomila (CEO, Golden), James Smith (CEO, Bugsnag), Fred Stevens-Smith (CEO, Rainforest), Bobby Goodlatte, Marc McCabe, Julia DeWahl and others.

Cofounders Karri Saarinen, Tuomas Artman, and Jori Lallo — all Finnish but now based in the Bay Area — know something first-hand about software development and the trials and tribulations of working with disparate and distributed teams. Saarinen was previously the principal designer of Airbnb, as well as the first designer of Coinbase; Artman had been staff engineer and architect at Uber; and Lallo also had been at Coinbase as a senior engineer building its API and front end.

“When we worked at many startups and growth companies we felt that the tools weren’t matching the way we’re thinking or operating,” Saarinen said in an email interview. “It also seemed that no-one had took a fresh look at this as a design problem. We believe there is a much better, modern workflow waiting to be discovered. We believe creators should focus on the work they create, not tracking or reporting what they are doing. Managers should spend their time prioritizing and giving direction, not bugging their teams for updates. Running the process shouldn’t sap your team’s energy and come in the way of creating.”

Linear cofounders (from left): Karri Saarinen, Jori Lallo, and Tuomas Artman

All of that translates to, first and foremost, speed and a platform whose main purpose is to help you work faster. “While some say speed is not really a feature, we believe it’s the core foundation for tools you use daily,” Saarinen noted.

A ⌘K command calls up a menu of shortcuts to edit an issue’s status, assign a task, and more so that everything can be handled with keyboard shortcuts. Pages load quickly and synchronise in real time (and search updates alongside that). Users can work offline if they need to. And of course there is also a dark mode for night owls.

The platform is still very much in its early stages. It currently has three integrations based on some of the most common tools used by developers — GitHub (where you can link Pull Requests and close Linear issues on merge), Figma designs (where you can get image previews and embeds of Figma designs), and Slack (you can create issues from Slack and then get notifications on updates). There are plans to add more over time.

“We started solving the problem from the end-user perspective, the contributor, like an engineer or a designer and starting to address things that are important for them, can help them and their teams,” Saarinen said. “We aim to also bring clarity for the teams by making the concepts simple, clear but powerful. For example, instead of talking about epics, we have Projects that help track larger feature work or tracks of work.”

Indeed, speed is not the only aim with Linear. Saarinen also said another area they hope to address is general work practices, with a take that seems to echo a turn away from time spent on manual management and more focus on automating that process.

“Right now at many companies you have to manually move things around, schedule sprints, and all kinds of other minor things,” he said. “We think that next generation tools should have built in automated workflows that help teams and companies operate much more effectively. Teams shouldn’t spend a third or more of their time a week just for running the process.”

The last objective Linear is hoping to tackle is one that we’re often sorely lacking in the wider world, too: context.

“Companies are setting their high-level goals, roadmaps and teams work on projects,” he said. “Often leadership doesn’t have good visibility into what is actually happening and how projects are tracking. Teams and contributors don’t always have the context or understanding of why they are working on the things, since you cannot follow the chain from your task to the company goal. We think that there are ways to build Linear to be a real-time picture of what is happening in the company when it comes to building products, and give the necessary context to everyone.”

Linear is a late entrant in a world filled with collaboration apps, and specifically workflow and collaboration apps targeting the developer community. These include not just Slack and GitHub, but Atlassian’s Trello and Jira, as well as Asana, Basecamp and many more.

Saarinen would not be drawn on which of these (or others) Linear sees as direct competition, noting that none addresses developer issues of speed, ease of use and context as well as Linear does.

“There are many tools in the market and many companies are talking about making ‘work better,’” he said. “And while there are many issue tracking and project management tools, they are not supporting the workflow of the individual and team. A lot of the value these tools sell is around tracking work that happens, not actually helping people to be more effective. Since our focus is on the individual contributor and intelligent integration with their workflow, we can support them better and as a side effect makes the information in the system more up to date.”

Stephanie Zhan, the partner at Sequoia whose speciality is seed and Series A investments and who led this round, said that Linear came onto her radar when it first launched its private beta (it’s still in private beta and has been running a waitlist to bring on new users. In that time it’s picked up hundreds of companies, including Pitch, Render, Albert, Curology, Spoke, Compound and YC startups including Middesk, Catch and Visly). The company had also been flagged by one of Sequoia’s Scouts, who invested earlier this year.


Although Linear is based out of San Francisco, it’s interesting that the three founders’ roots are in Finland (with Saarinen in Helsinki this week to speak at the Slush event), and brings up an emerging trend of Silicon Valley VCs looking at founders from further afield than just their own back yard.

“The interesting thing about Linear is that as they’re building a software company around the future of work, they’re also building a remote and distributed team themselves,” Zhan said. The company currently has only four employees.

In that vein, we (and others, it seems) had heard that Sequoia (which today invests in several Europe-based startups, including Tessian, Graphcore, Klarna, Tourlane, Evervault and CEGX) has been considering establishing a more permanent presence in this part of the world, specifically in London.

Sources familiar with the firm, however, tell us that while it has been sounding out VCs at other firms, saying a London office is on the horizon might be premature, as there are as yet no plans to set up shop here. However, with more companies and European founders entering its portfolio, and as more conversations with VCs turn into decisions to make the leap to help Sequoia source more startups, we could see this strategy turning around quickly.

Jun
10
2019
--

AWS is now making Amazon Personalize available to all customers

Amazon Personalize, first announced during AWS re:Invent last November, is now available to all Amazon Web Services customers. The API enables developers to add custom machine learning models to their apps, including ones for personalized product recommendations, search results and direct marketing, even if they don’t have machine learning experience.

The API processes data using algorithms originally created for Amazon’s own retail business, but the company says all data will be “kept completely private, owned entirely by the customer.” The service is now available to AWS users in three U.S. regions, East (Ohio), East (North Virginia) and West (Oregon), two Asia Pacific regions (Tokyo and Singapore) and Ireland in the European Union, with more regions to launch soon.

AWS customers who have already added Amazon Personalize to their apps include Yamaha Corporation of America, Subway, Zola and Segment. In Amazon’s press release, Yamaha Corporation of America Director of Information Technology Ishwar Bharbhari said Amazon Personalize “saves us up to 60% of the time needed to set up and tune the infrastructure and algorithms for our machine learning models when compared to building and configuring the environment on our own.”

Amazon Personalize charges five cents per GB of data uploaded to the service and 24 cents per training hour used to train a custom model on that data. Real-time recommendation requests are priced by volume, with discounts for larger orders.
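As a rough illustration of the two flat rates quoted above, the sketch below estimates a bill for data upload and training only. The function name and the sample workload (100 GB, 10 training hours) are hypothetical, and real-time recommendation charges are left out because the article gives no per-request figure.

```python
# Rates quoted in the article, kept in integer cents to avoid float rounding.
UPLOAD_CENTS_PER_GB = 5
TRAINING_CENTS_PER_HOUR = 24


def personalize_cost_cents(gb_uploaded: int, training_hours: int) -> int:
    """Estimate upload plus training cost in cents.

    Real-time recommendation requests are not modeled here.
    """
    return gb_uploaded * UPLOAD_CENTS_PER_GB + training_hours * TRAINING_CENTS_PER_HOUR


# Hypothetical workload: 100 GB uploaded, 10 hours of training.
cents = personalize_cost_cents(100, 10)
print(f"${cents / 100:.2f}")  # prints $7.40
```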

May
06
2019
--

Microsoft and GitHub grow closer

Microsoft’s $7.5 billion acquisition of GitHub closed last October. Today, at its annual Build developer conference, Microsoft announced a number of new integrations between its existing services and GitHub. None of these are earth-shattering or change the nature of any of GitHub’s fundamental features, but they do show how Microsoft is starting to bring GitHub closer into the fold.

It’s worth noting that Microsoft isn’t announcing any major GitHub features at Build, though it was only a few weeks ago that the company made a major change by giving GitHub Free users access to unlimited private repositories. For major feature releases, GitHub has its own conference.

So what are the new integrations? Most of them center around identity management. That means GitHub Enterprise users can now use Azure Active Directory to access GitHub. Developers will also be able to use their existing GitHub accounts to log into Azure features like the Azure Portal and Azure DevOps. “This update enables GitHub developers to go from repository to deployment with just their GitHub account,” Microsoft argues in its release announcement.

As far as selling GitHub goes, Microsoft also today announced a new Visual Studio subscription with access to GitHub Enterprise for Microsoft’s Enterprise Agreement customers. Given that there is surely a lot of overlap between Visual Studio’s enterprise customers and GitHub Enterprise users, this move makes sense. Chances are, it’ll also make moving to GitHub Enterprise more enticing for current Visual Studio subscribers.

Lastly, the Azure Boards app, which offers features like Kanban boards and sprint planning tools, is now also available in the GitHub Marketplace.

Apr
24
2019
--

Docker developers can now build Arm containers on their desktops

Docker and Arm today announced a major new partnership that will see the two companies collaborate in bringing improved support for the Arm platform to Docker’s tools.

The main idea here is to make it easy for Docker developers to build their applications for the Arm platform right from their x86 desktops and then deploy them to the cloud (including the Arm-based AWS EC2 A1 instances), edge and IoT devices. Developers will be able to build their containers for Arm just like they do today, without the need for any cross-compilation.

This new capability, which will work for applications written in JavaScript/Node.js, Python, Java, C++, Ruby, .NET core, Go, Rust and PHP, will become available as a tech preview next week, when Docker hosts its annual North American developer conference in San Francisco.

Typically, developers would have to build the containers they want to run on the Arm platform on an Arm-based server. With this system, which is the first result of this new partnership, Docker essentially emulates an Arm chip on the PC for building these images.

“Overnight, the 2 million Docker developers that are out there can use the Docker commands they already know and become Arm developers,” Docker EVP of Strategic Alliances David Messina told me. “Docker, just like we’ve done many times over, has simplified and streamlined processes and made them simpler and accessible to developers. And in this case, we’re making x86 developers on their laptops Arm developers overnight.”

Given that cloud-based Arm servers like Amazon’s A1 instances are often significantly cheaper than x86 machines, users can achieve some immediate cost benefits by using this new system and running their containers on Arm.

For Docker, this partnership opens up new opportunities, especially in areas where Arm chips are already strong, including edge and IoT scenarios. Arm, similarly, is interested in strengthening its developer ecosystem by making it easier to develop for its platform. The easier it is to build apps for the platform, the more likely developers are to then run them on servers that feature chips from Arm’s partners.

“Arm’s perspective on the infrastructure really spans all the way from the endpoint, all the way through the edge to the cloud data center, because we are one of the few companies that have a presence all the way through that entire path,” Mohamed Awad, Arm’s VP of Marketing, Infrastructure Line of Business, said. “It’s that perspective that drove us to make sure that we engage Docker in a meaningful way and have a meaningful relationship with them. We are seeing compute and the infrastructure sort of transforming itself right now from the old model of centralized compute, general purpose architecture, to a more distributed and more heterogeneous compute system.”

Developers, however, Awad rightly noted, don’t want to have to deal with this complexity, yet they also increasingly need to ensure that their applications run on a wide variety of platforms and that they can move them around as needed. “For us, this is about enabling developers and freeing them from lock-in on any particular area and allowing them to choose the right compute for the right job that is the most efficient for them,” Awad said.

Messina noted that the promise of Docker has long been to remove applications’ dependence on the infrastructure on which they run. Adding Arm support simply extends this promise to an additional platform. He also stressed that the work on this was driven by the company’s enterprise customers. These are the users who have already set up their systems for cloud-native development with Docker’s tools, at least for their x86 development. Those customers are now looking at developing for their edge devices, too, and that often means developing for Arm-based devices.

Awad and Messina both stressed that developers really don’t have to learn anything new to make this work. All of the usual Docker commands will just work.

Jan
07
2019
--

GitHub Free users now get unlimited private repositories

If you’re a GitHub user, but you don’t pay, this is a good week. Historically, GitHub always offered free accounts but the caveat was that your code had to be public. To get private repositories, you had to pay. Starting tomorrow, that limitation is gone. Free GitHub users now get unlimited private projects with up to three collaborators.

The number of collaborators is really the only limitation here, and there’s no change to how the service handles public repositories, which can still have unlimited collaborators.

This feels like a sign of goodwill on behalf of Microsoft, which closed its acquisition of GitHub last October, with former Xamarin CEO Nat Friedman taking over as GitHub’s CEO. Some developers were rather nervous about the acquisition (though it feels like most have come to terms with it). It’s also a fair guess to assume that GitHub’s model for monetizing the service is a bit different from Microsoft’s. Microsoft doesn’t need to try to get money from small teams — that’s not where the bulk of its revenue comes from. Instead, the company is mostly interested in getting large enterprises to use the service.

Talking about teams, GitHub also today announced that it is changing the name of the GitHub Developer suite to ‘GitHub Pro.’ The company says it’s doing so in order to “help developers better identify the tools they need.”

But what’s maybe even more important is that GitHub Business Cloud and GitHub Enterprise (now called Enterprise Cloud and Enterprise Server) have become one and are now sold under the ‘GitHub Enterprise’ label and feature per-user pricing.

The announcement of free private repositories probably took some of GitHub’s competitors by surprise, but here is what we heard from GitLab CEO Sid Sijbrandij: “GitHub today announced the launch of free private repositories with up to three collaborators. GitLab has offered unlimited collaborators on private repositories since the beginning. We believe Microsoft is focusing more on generating revenue with Azure and less on charging for DevOps software. At GitLab, we believe in a multi-cloud future where organizations use multiple public cloud platforms.”

Note: this story was scheduled for tomorrow, but due to a broken embargo, we decided to publish today. The updates will go live tomorrow.

Nov
29
2018
--

AWS launches a managed Kafka service

Kafka is an open-source tool for handling incoming streams of data. Like virtually all powerful tools, it’s somewhat hard to set up and manage. Today, Amazon’s AWS is making this all a bit easier for its users with the launch of Amazon Managed Streaming for Kafka. That’s a mouthful, but it’s essentially Kafka as a fully managed, highly available service on AWS. It’s now available on AWS as a public preview.

As AWS CTO Werner Vogels noted in his AWS re:Invent keynote, Kafka users traditionally had to do a lot of heavy lifting to set up a cluster on AWS and to ensure that it could scale and handle failures. “It’s a nightmare having to restart all the cluster and the main nodes,” he said. “This is what I would call the traditional heavy lifting that AWS is really good at solving for you.”

It’s interesting to see AWS launch this service, given that it already offers a very similar tool in Kinesis, a tool that also focuses on ingesting streaming data. There are plenty of applications on the market today that already use Kafka, and AWS is clearly interested in giving those users a pathway to either move to a managed Kafka service or to AWS in general.

As with all things AWS, the pricing is a bit complicated, but a basic Kafka instance will start at $0.21 per hour. You’re not likely to just use one instance, so for a somewhat useful setup with three brokers and a good amount of storage and some other fees, you’ll quickly pay well over $500 per month.
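To see how the monthly figure adds up, here is a back-of-the-envelope sketch of the broker cost alone. It uses the $0.21 per hour rate quoted above and assumes a 730-hour month (24 × 365 / 12); the function name is hypothetical, and storage and other fees are not modeled because the article gives no numbers for them.

```python
# Hourly rate from the article, kept in integer cents to avoid float rounding.
BROKER_CENTS_PER_HOUR = 21
HOURS_PER_MONTH = 730  # assumption: 24 * 365 / 12


def monthly_broker_cost_cents(brokers: int, hours: int = HOURS_PER_MONTH) -> int:
    """Return the instance-only cost in cents; storage and other fees are excluded."""
    return brokers * BROKER_CENTS_PER_HOUR * hours


cost = monthly_broker_cost_cents(3)
print(f"${cost / 100:.2f}")  # prints $459.90
```

Three brokers alone come to about $460 per month under these assumptions, so the storage and other fees mentioned above are what push the total past $500.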
