Jul
12
2018
--

GitHub Enterprise and Business Cloud users now get access to public repos, too

GitHub, the code hosting service Microsoft recently acquired, is launching a couple of new features for its business users today that’ll make it easier for them to access public repositories on the service.

Traditionally, users on the hosted Business Cloud and self-hosted Enterprise were not able to directly access the millions of public open-source repositories on the service. Now, with this release, that’s changing, and business users will be able to reach beyond their firewalls to engage and collaborate with the rest of the GitHub community directly.

With this, GitHub now also offers its business and enterprise users a new unified search feature that lets them search their internal repos as well as open-source ones.

Other new features in this latest Enterprise release include the ability to ignore whitespace when reviewing changes, the ability to require multiple reviewers for code changes, automated support tickets and more. You can find a full list of all updates here.

Microsoft’s acquisition of GitHub wasn’t fully unexpected (and it’s worth noting that the acquisition hasn’t closed yet), but it is still controversial, given that Microsoft and the open-source community, which heavily relies on GitHub, haven’t always seen eye-to-eye in the past. I’m personally not too worried about that, and it feels like the dust has settled at this point and that people are waiting to see what Microsoft will do with the service.

Apr
20
2018
--

Kubernetes and Cloud Foundry grow closer

Containers are eating the software world, and Kubernetes is the king of containers. So if you are working on any major software project, especially in the enterprise, you will run into it sooner or later. Cloud Foundry, which hosted its semi-annual developer conference in Boston this week, is an interesting example of this.

Outside of the world of enterprise developers, Cloud Foundry remains a bit of an unknown entity, despite having users in at least half of the Fortune 500 companies (though in the startup world, it has almost no traction). If you are unfamiliar with Cloud Foundry, you can think of it as somewhat similar to Heroku, but as an open-source project with a large commercial ecosystem and the ability to run it at scale on any cloud or on-premises installation. Developers write their code (following the twelve-factor methodology), define what it needs to run and Cloud Foundry handles all of the underlying infrastructure and — if necessary — scaling. Ideally, that frees up the developer from having to think about where their applications will run and lets them work more efficiently.

To enable all of this, the Cloud Foundry Foundation made a very early bet on containers, even before Docker was a thing. Since Kubernetes wasn’t around at the time, the various companies involved in Cloud Foundry came together to build their own container orchestration system, which still underpins much of the service today. As Kubernetes took off, though, the pressure to bring support for it grew inside of the Cloud Foundry ecosystem. Last year, the Foundation announced its first major move in this direction by launching its Kubernetes-based Container Runtime for managing containers, which sits next to the existing Application Runtime. With this, developers can use Cloud Foundry to run and manage their new (and existing) monolithic apps and run them in parallel with the new services they develop.

But remember how Cloud Foundry also still uses its own container service for the Application Runtime? There is really no reason to do that now that Kubernetes (and the various other projects in its ecosystem) has become the default way of handling containers. It’s maybe no surprise then that there is now a Cloud Foundry project that aims to rip out the old container management systems and replace them with Kubernetes. The container management piece isn’t what differentiates Cloud Foundry, after all. Instead, it’s the developer experience, and at the end of the day, the whole point of Cloud Foundry is that developers shouldn’t have to care about the internal plumbing of the infrastructure.

There is another aspect to how the Cloud Foundry ecosystem is embracing Kubernetes. Since Cloud Foundry is also just software, there’s nothing stopping you from running it on top of Kubernetes. And with that, it’s no surprise that some of the largest Cloud Foundry vendors, including SUSE and IBM, are doing exactly that.

The SUSE Cloud Application Platform, which is a certified Cloud Foundry distribution, can run on any public cloud Kubernetes infrastructure, including the Microsoft Azure Container Service. As the SUSE team told me, that means it’s not just easier to deploy, but also far less resource-intensive to run.

Similarly, IBM is now offering Cloud Foundry on top of Kubernetes for its customers, though it’s only calling this an experimental product for now. IBM’s GM of Cloud Developer Services Don Boulia stressed that IBM’s customers were mostly looking for ways to run their workloads in an isolated environment that isn’t shared with other IBM customers.

Boulia also stressed that for most customers, it’s not about Kubernetes versus Cloud Foundry. For most of his customers, using Kubernetes by itself is very much about moving their existing applications to the cloud. And for new applications, those customers are then opting to run Cloud Foundry.

That’s something the SUSE team also stressed. One pattern SUSE has seen is that potential customers come to it with the idea of setting up a container environment and then, over the course of the conversation, decide to implement Cloud Foundry as well.

Indeed, the message of this week’s event was very much that Kubernetes and Cloud Foundry are complementary technologies. That’s something Chen Goldberg, Google’s Director of Engineering for Container Engine and Kubernetes, also stressed during a panel discussion at the event.

Both the Cloud Foundry Foundation and the Cloud Native Computing Foundation (CNCF), the home of Kubernetes, are under the umbrella of the Linux Foundation. They take somewhat different approaches to their communities, with Cloud Foundry stressing enterprise users far more than the CNCF. There are probably some politics at play here, but for the most part, the two organizations seem friendly enough, and they do share a number of members. “We are part of CNCF and part of Cloud Foundry Foundation,” Pivotal CEO Rob Mee told our own Ron Miller. “Those communities are increasingly sharing tech back and forth and evolving together. Not entirely independent and not competitive either. Lot of complexity and subtlety. CNCF and Cloud Foundry are part of a larger ecosystem with complementary and converging tech.”

We’ll likely see more of this technology sharing — and maybe collaboration — between the CNCF and Cloud Foundry going forward. The CNCF is, after all, the home of a number of very interesting projects for building cloud-native applications that do have their fair share of use cases in Cloud Foundry, too.

Apr
18
2018
--

Cloud.gov makes Cloud Foundry easier to adopt for government agencies

At the Cloud Foundry Summit in Boston, the team behind the U.S. government’s cloud.gov application platform announced that it is now a certified Cloud Foundry platform that is guaranteed to be compatible with other certified providers, like Huawei, IBM, Pivotal, SAP and, also starting today, SUSE. With this, cloud.gov is the first government agency to become Cloud Foundry-certified.

The point behind the certification is to ensure that all of the various platforms that support Cloud Foundry are compatible with each other. In the government context, this means that agencies can easily move their workloads between clouds (assuming they have all the necessary government certifications in place). But what’s maybe even more important is that it also ensures skills portability, which should make hiring and finding contractors easier for these agencies. Given that the open source Cloud Foundry project has seen quite a bit of adoption in the private sector, with half of the Fortune 500 companies using it, that’s often an important factor for deciding which platform to build on.

From the outset, cloud.gov, which was launched by the General Services Administration’s 18F office to improve the U.S. government’s public-facing websites and applications, was built on top of Cloud Foundry. Similar agencies in Australia and the U.K. have made the same decision to standardize on the Cloud Foundry platform. Cloud Foundry launched its certification program a few years ago; last year it added another program for certifying the skills of individual developers.

To be able to run government workloads, a cloud platform has to meet a certain set of security requirements. As Cloud Foundry Foundation CTO Chip Childers told me, the work 18F did to get the FedRAMP authorization for cloud.gov helped bring better controls to the upstream project, too, and he stressed that all of the governments that have adopted the platform have contributed to the overall project.

Mar
22
2018
--

GitLab adds support for GitHub

Here is an interesting twist: GitLab, which in many ways competes with GitHub as a shared code repository service for teams, is bringing its continuous integration and delivery (CI/CD) features to GitHub.

The new service is launching today as part of GitLab’s hosted service. It will remain free to developers until March 22, 2019. After that, it’s moving to GitLab.com’s paid Silver tier.

GitHub itself offers some basic project and task management services on top of its core tools, but for the most part, it leaves the rest of the DevOps lifecycle to partners. GitLab offers a more complete CI/CD solution with integrated code repositories, but while GitLab has grown in popularity, GitHub is surely better known among developers and businesses. With this move, GitLab hopes to gain new users — and especially enterprise users — who are currently storing their code on GitHub but are looking for a CI/CD solution.

The new GitHub integration allows developers to set up their projects in GitLab and connect them to a GitHub repository. So whenever developers push code to their GitHub repository, GitLab will kick off that project’s CI/CD pipeline with automated builds, tests and deployments.

“Continuous integration and deployment form the backbone of modern DevOps,” said Sid Sijbrandij, CEO and co-founder of GitLab. “With this new offering, businesses and open source projects that use GitHub as a code repository will have access to GitLab’s industry leading CI/CD capabilities.”

It’s worth noting that GitLab offers a very similar integration with Atlassian’s Bitbucket, too.

Feb
27
2018
--

IBM Watson CTO Rob High on bias and other challenges in machine learning

For IBM Watson CTO Rob High, the biggest technological challenge in machine learning right now is figuring out how to train models with less data. “It’s a challenge, it’s a goal and there’s certainly reason to believe that it’s possible,” High told me during an interview at the annual Mobile World Congress in Barcelona.

Feb
21
2018
--

Google’s Cloud IoT Core is now generally available

Cloud IoT Core, Google’s fully managed service for connecting, managing and ingesting data from IoT devices, is now out of beta and generally available. Google envisions the service, which launched in public beta last September, as the first entry point for IoT data into its cloud. Once the data has been ingested, users can use Cloud IoT Core to push data to Google’s cloud…

Feb
20
2018
--

Stride, Atlassian’s Slack competitor, opens its API to all developers

The arrival of Stride, Atlassian’s Slack competitor, was probably the company’s biggest launch of 2017. While the company generally allows developers to easily integrate with its products, Stride’s API remained in closed beta for significantly longer than the product itself, which exited beta last September. Today, however, Atlassian is opening the Stride API to all developers.

Jan
25
2018
--

IBM brings Mendix’s low-code platform to its cloud

IBM today announced a partnership with low-code development platform Mendix that will bring Mendix and native integration with many of IBM’s Watson IoT and AI services to the IBM Cloud. This deal is an evolution of a previous partnership that involved what was then called IBM Bluemix (now IBM Cloud).

Jan
23
2018
--

Unravel Data raises $15M Series B for its big data performance monitoring platform

Big data systems tend to be large, complex and often hard to troubleshoot. In the world of databases, web and mobile stacks, application performance management services like AppDynamics and New Relic help ops teams keep tabs on their system. In the big data world, Unravel Data is one of the few APM players to focus solely on the complete big data stack from ingestion to analysis.

Nov
14
2017
--

Common MongoDB Topologies

In this blog post, we’ll look at some common MongoDB topologies used in database deployments.

The question of the best architecture for MongoDB will come up in conversations between developers and architects. In this post, we go over the main sharded and unsharded designs, with their pros and cons.

We will first look at “Replica Sets.” Replica sets are the most basic form of high availability (HA) in MongoDB, and the building blocks for sharding. From there, we will cover sharding approaches and whether you need to go that route.

Replica Set

From the MongoDB manual:

A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.

Short of sharding, this is the ideal way to run MongoDB. Things like high availability, failover and recovery become automated, with no action typically needed. If you expect large growth or more than 200 GB of data, you should consider combining this with sharding to reduce your mean time to recovery when restoring from backup.

Pros:

  • Elections happen automatically and go unnoticed by applications set up with retry logic
  • Rebuilding a new node, or adding an additional read-only node, is as easy as “rs.add(‘hostname’)”
  • Can skip building indexes to improve write speed
  • Can have members
    • hidden in another geographic location
    • with delayed replication
    • used as analytics nodes via tagging

Cons:

  • Depending on the size of the oplog used, you can use 10-100+% more space to hold the change data for replication
  • You must scale up, not out, meaning more expensive hardware
  • Recovery using a sharded approach is faster than having it all on a single node (parallelism)
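
To make the replica set setup more concrete, here is a minimal sketch of how an application might build its connection string so the driver knows about every member. The host names and the set name "rs0" are hypothetical, and the helper function is ours, not part of any driver. The URI options used (replicaSet, readPreference, retryWrites) are standard connection string options; retryWrites requires MongoDB 3.6 or later, and on older versions the retry logic has to live in the application.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary", retry_writes=True):
    """Build a MongoDB connection string that lists every replica set member.

    Hypothetical helper for illustration; it only assembles a string and
    does not open any connection.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    options = {
        "replicaSet": replica_set,          # name the members were initiated with
        "readPreference": read_preference,  # e.g. send reads to secondaries
        "retryWrites": "true" if retry_writes else "false",
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Hypothetical three-member set; reads prefer secondaries when one is available.
uri = replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    replica_set="rs0",
    read_preference="secondaryPreferred",
)
print(uri)
```

Listing all members means the driver can still find the set if any single host is down when the application starts.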

Flat Mongos (not load balanced)

This is one of MongoDB’s more commonly suggested deployment designs. To understand why, we should talk about the driver and the fact that it supports a comma-separated list of mongos hosts for fail-over.

You can’t distribute writes in a single replica set. Instead, they all need to go to the primary node. You can distribute reads to the secondaries using Read Preferences. The driver keeps track of what is a primary and what is a secondary and routes queries appropriately.

Conceptually, the driver should have connections bucketed by the mongos they go to. This allows the 3.0+ driver to be semi-stateless and ask any connection to a specific mongos to perform a getMore on that mongos. In theory, this allows slightly more concurrency. Realistically, you only use one mongos, since this is only a fail-over system.

Pros:

  • Mongos is on its own gear, so it will not run the application out of memory
  • If Mongos doesn’t respond, the driver fails over to the next in the list
  • Can be put closer to the database or application depending on your network and sorting needs

Cons:

  • You can’t use the mongos in a list evenly, so in most drivers the list is only good for fail-over (not evenness). Please read your specific driver’s documentation for support, and test thoroughly.
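
The fail-over behavior described above can be modeled in a few lines. This is a toy sketch under our own assumptions, not any driver's actual selection algorithm: the host names are hypothetical, and the health check is a stand-in callable rather than a real network probe.

```python
def mongos_uri(mongos_hosts):
    """Join the mongos hosts into the comma-separated form drivers accept."""
    return "mongodb://" + ",".join(mongos_hosts) + "/"


def first_responsive(mongos_hosts, is_responsive):
    """Return the first host, in list order, that passes the health check.

    Every request goes to the same host until it stops responding, which is
    why this layout gives fail-over but not an even spread of load.
    """
    for host in mongos_hosts:
        if is_responsive(host):
            return host
    raise ConnectionError("no mongos in the list responded")


# Hypothetical hosts; pretend the first mongos is down.
hosts = ["mongos1.example.net:27017", "mongos2.example.net:27017"]
down = {"mongos1.example.net:27017"}
chosen = first_responsive(hosts, lambda host: host not in down)
print(mongos_uri(hosts), "->", chosen)
```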

Load Balanced (preferred if possible)

According to the Mongo docs:

You may also deploy a group of mongos instances and use a proxy/load balancer between the application and the mongos. In these deployments, you must configure the load balancer for client affinity so that every connection from a single client reaches the same mongos.

This is the model used by platforms such as ObjectRocket. In this pattern, you move mongos nodes to their own tier but then put them behind a load balancer. In this design, you can even out the use of mongos by using a least-connection system. The challenge, however, is that newer drivers have issues with getMores. By this we mean the getMore selects a new random connection, and the load balancer can’t be sure which mongos should get it. Thus it has a one in N (number of mongos) chance of selecting the right one; otherwise you get a “Cursor Not Found” error.

Pros:

  • Ability to have an even use of mongos
  • Mongos are separated from each other and the applications to prevent memory and CPU contention
  • You can easily remove or add mongos to help scale the layer without code changes
  • High availability at every level (multiple mongos, multiple configs, ReplSet for high availability and even multiple applications for app failures)

Cons:

  • If batching is used, you can get “Cursor Not Found” errors unless you switch to an IP pinning algorithm (which loses evenness), because the wrong mongos receives the getMore and bulk connector connections
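
To illustrate the trade-off between evenness and client affinity, here is a small sketch. It assumes a load balancer that pins clients by hashing their IP address; real load balancers implement affinity in their own ways, and the function names and hosts below are hypothetical.

```python
import hashlib


def pinned_mongos(client_ip, mongos_hosts):
    """Client affinity: hash the client IP so it always maps to the same mongos."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(mongos_hosts)
    return mongos_hosts[index]


def getmore_hit_chance(mongos_count):
    """Chance that an unpinned getMore reaches the mongos holding the cursor."""
    return 1 / mongos_count


mongos_hosts = ["mongos1:27017", "mongos2:27017", "mongos3:27017"]

# With pinning, the initial query and the later getMore land on the same mongos.
first = pinned_mongos("10.0.0.5", mongos_hosts)
second = pinned_mongos("10.0.0.5", mongos_hosts)
print(first == second)

# Without pinning, the odds shrink as the mongos tier grows.
print(getmore_hit_chance(len(mongos_hosts)))
```

Pinning fixes the cursor problem, but a few busy clients can then overload one mongos, which is the loss of evenness mentioned above.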

App-Centric Mongos

By and large, this is one of the most typical deployment designs for MongoDB sharding. In it, we have each application host talking to a mongos on the local network interface. This ensures there is very little latency to the application from the mongos.

Additionally, this means if a mongos fails, at most its own host is affected instead of the wider range of all application hosts.

Pros:

  • Local mongos on the loopback interface means low to no latency
  • Limited scope of outage if this mongos fails
  • Can be geographically farther from the data storage in cases where you have a DR site

Cons:

  • Mongos is a memory hog; you could steal from your application memory to support running it here
    • Made worse with large batches, many connections, and sorting
  • Mongos is single-threaded and could become a bottleneck for your application
  • It is possible for a slow network to cause bad decision making, including duplicate databases on different shards. The functional result is data writing intermittently to two locations, and a DBA must remediate that at some point (think MMM VIP ping pong issues)
  • All sorting and limits are applied on the application host. In cases where the sort uses an index this is OK, but if it is not indexed, the entire result set must be held in memory by mongos and sorted, and only then is the limited number of results returned to the client. This is the typical cause of mongos OOM errors, due to the memory issues listed before.
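
The memory difference between the indexed and unindexed sort cases can be sketched as follows. This models the behavior described above and is not mongos's actual implementation; plain integers stand in for documents, and both function names are hypothetical.

```python
import heapq
from itertools import chain, islice


def merge_indexed(shard_streams, limit):
    """Each shard returns results already in index order.

    Merge them lazily and stop at the limit, so only a handful of
    documents are held in memory at once.
    """
    return list(islice(heapq.merge(*shard_streams), limit))


def sort_unindexed(shard_streams, limit):
    """No usable index: buffer every document, sort, then apply the limit."""
    buffered = list(chain.from_iterable(shard_streams))  # whole result set in memory
    buffered.sort()
    return buffered[:limit], len(buffered)


# Indexed case: three shards, each stream pre-sorted.
top_indexed = merge_indexed([[1, 4, 7], [2, 5, 8], [3, 6, 9]], limit=3)

# Unindexed case: the same values arrive in arbitrary order.
top_unindexed, held_in_memory = sort_unindexed([[7, 1, 4], [8, 2, 5], [9, 6, 3]], limit=3)

print(top_indexed, top_unindexed, held_in_memory)
```

Both paths return the same three results, but the second one had to hold all nine values first. Scale that to millions of documents on a host shared with your application and the OOM risk becomes clear.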

Conclusion

The topologies above cover many of the deployment needs for MongoDB environments. We hope this helps. Please leave any questions in the comments below.
