Jun
02
2021
--

Iterative raises $20M for its MLOps platform

Iterative, an open-source startup that is building an enterprise AI platform to help companies operationalize their models, today announced that it has raised a $20 million Series A round led by 468 Capital and Mesosphere co-founder Florian Leibert. Previous investors True Ventures and Afore Capital also participated in this round, which brings the company’s total funding to $25 million.

The core idea behind Iterative is to provide data scientists and data engineers with a platform that closely resembles a modern GitOps-driven development stack.

After spending time in academia, Iterative co-founder and CEO Dmitry Petrov joined Microsoft as a data scientist on the Bing team in 2013. He noted that the industry has changed quite a bit since then. While early on, the questions were about how to build machine learning models, today the problem is how to build predictable processes around machine learning, especially in large organizations with sizable teams. “How can we make the team productive, not the person? This is a new challenge for the entire industry,” he said.

Big companies (like Microsoft) were able to build their own proprietary tooling and processes to build their AI operations, Petrov noted, but that’s not an option for smaller companies.

Currently, Iterative’s stack consists of a couple of different components that sit on top of tools like GitLab and GitHub. These include DVC for running experiments and data and model versioning, CML, the company’s CI/CD platform for machine learning, and the company’s newest product, Studio, its SaaS platform for enabling collaboration between teams. Instead of reinventing the wheel, Iterative essentially provides data scientists who already use GitHub or GitLab to collaborate on their source code with a tool like DVC Studio that extends this to help them collaborate on data and metrics, too.

“DVC Studio enables machine learning developers to run hundreds of experiments with full transparency, giving other developers in the organization the ability to collaborate fully in the process,” said Petrov. “The funding today will help us bring more innovative products and services into our ecosystem.”

Petrov stressed that he wants to build an ecosystem of tools, not a monolithic platform. When the company closed this current funding round about three months ago, Iterative had about 30 employees, many of whom were previously active in the open-source community around its projects. Today, that number is already closer to 60.

“Data, ML and AI are becoming an essential part of the industry and IT infrastructure,” said Leibert, general partner at 468 Capital. “Companies with great open-source adoption and bottom-up market strategy, like Iterative, are going to define the standards for AI tools and processes around building ML models.”

Mar
22
2021
--

Storing Kubernetes Operator for Percona Server for MongoDB Secrets in GitHub

More and more companies are adopting GitOps as the way of implementing Continuous Deployment. Its elegant approach, built upon a well-known tool, wins the hearts of engineers. But even if your git repository is private, it’s strongly discouraged to store keys and passwords in unencrypted form.

This blog post will show how easy it is to use GitOps and keep Kubernetes secrets for Percona Kubernetes Operator for Percona Server for MongoDB securely in the repository with Sealed Secrets or Vault Secrets Operator.

Sealed Secrets

Prerequisites:

  • Kubernetes cluster up and running
  • GitHub repository (optional)

Install Sealed Secrets Controller

Sealed Secrets relies on asymmetric cryptography (which is also used in TLS), where the private key (which in our case is stored in Kubernetes) can decrypt a message encrypted with the public key (which can safely be stored in a public git repository). To make this task easier, Sealed Secrets provides the kubeseal tool, which helps with the encryption of the secrets.
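
To make the public/private key idea concrete, here is a minimal sketch using plain openssl. It is only an illustration of the general principle and not how kubeseal works internally; the file names and the sample password are made up for this example.

```shell
#!/usr/bin/env bash
# Illustration only: generic RSA encryption with openssl, not kubeseal's own scheme.
workdir=$(mktemp -d)

# Key pair: the private half plays the role of the key kept inside the cluster.
openssl genrsa -out "$workdir/private.pem" 2048 2>/dev/null
openssl rsa -in "$workdir/private.pem" -pubout -out "$workdir/public.pem" 2>/dev/null

# Anyone holding only the public key can encrypt...
printf '%s' 'example-password' > "$workdir/plain.txt"
openssl pkeyutl -encrypt -pubin -inkey "$workdir/public.pem" \
  -in "$workdir/plain.txt" -out "$workdir/cipher.bin"

# ...but only the holder of the private key can recover the plaintext.
decrypted=$(openssl pkeyutl -decrypt -inkey "$workdir/private.pem" \
  -in "$workdir/cipher.bin")
echo "$decrypted"
```

The encrypted file (cipher.bin here) is the part that would be safe to publish; the private key never leaves the machine that decrypts.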

Install the Sealed Secrets controller into your Kubernetes cluster:

kubectl apply -f https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.15.0/controller.yaml

It will install the controller into the kube-system namespace and provide the Custom Resource Definition sealedsecrets.bitnami.com. All resources in Kubernetes with kind: SealedSecret will be handled by this Operator.

Download the kubeseal binary:

wget https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.15.0/kubeseal-linux-amd64 -O kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal

Encrypt the Keys

In this example, I intend to store important secrets of the Percona Kubernetes Operator for Percona Server for MongoDB in git along with my manifests that are used to deploy the database.

First, I will seal the secret file with system users, which is used by the MongoDB Operator to manage the database. Normally it is stored in deploy/secrets.yaml.

kubeseal --format yaml < secrets.yaml  > blog-data/sealed-secrets/mongod-secrets.yaml

This command creates a file with encrypted contents; you can see it in the blog-data/sealed-secrets repository. It is safe to store it publicly, as it can only be decrypted with the private key.
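
Since only the sealed file belongs in git, a small guard can help avoid committing a plain Secret by accident. The following is a sketch under assumptions: is_sealed_manifest is a hypothetical helper name, and the check simply looks at the top-level kind: line of each manifest passed as an argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the manifest declares a SealedSecret
# and contains no top-level plain Secret document.
is_sealed_manifest() {
  grep -Eq '^kind:[[:space:]]*SealedSecret[[:space:]]*$' "$1" &&
    ! grep -Eq '^kind:[[:space:]]*Secret[[:space:]]*$' "$1"
}

# Check every manifest given on the command line (no arguments: nothing to do).
for manifest in "$@"; do
  if ! is_sealed_manifest "$manifest"; then
    echo "refusing unsealed manifest: $manifest" >&2
    exit 1
  fi
done
```

Saved as a script, it could be run against blog-data/sealed-secrets/mongod-secrets.yaml before a commit; wiring it into a git hook is left out here.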

Executing

kubectl apply -f blog-data/sealed-secrets/mongod-secrets.yaml

does the following:

  1. A sealedsecrets custom resource (CR) is created. You can see it by executing kubectl get sealedsecrets.
  2. The Sealed Secrets Operator receives the event that a new sealedsecrets CR is there and decrypts it with the private key.
  3. Once decrypted, a regular Secret object is created, which can be used as usual.

$ kubectl get sealedsecrets
NAME               AGE
my-secure-secret   20m

$ kubectl get secrets my-secure-secret
NAME               TYPE     DATA   AGE
my-secure-secret   Opaque   10     20m

Next, I will also seal the keys for my S3 bucket that I plan to use to store backups of my MongoDB database:

kubeseal --format yaml < backup-s3.yaml  > blog-data/sealed-secrets/s3-secrets.yaml
kubectl apply -f blog-data/sealed-secrets/s3-secrets.yaml

Vault Secrets Operator

Sealed Secrets is the simplest approach, but it is possible to achieve the same result with HashiCorp Vault and Vault Secrets Operator. It is a more advanced, mature, and feature-rich approach.

Prerequisites:

Vault Secrets Operator also relies on a Custom Resource, but all the keys are stored in HashiCorp Vault:

Preparation

Create a policy on the Vault for the Operator:

cat <<EOF | vault policy write vault-secrets-operator -
path "kvv2/data/*" {
  capabilities = ["read"]
}
EOF

The policy might look a bit different, depending on where your secrets are.

Create and fetch the token for the policy:

$ vault token create -period=24h -policy=vault-secrets-operator

Key                  Value                                                                                                                                                                                        
---                  -----                                                                                               
token                s.0yJZfCsjFq75GiVyKiZgYVOm
...

Write down the token, as you will need it in the next step.

Create the Kubernetes Secret so that the Operator can authenticate with the Vault:

export VAULT_TOKEN=s.0yJZfCsjFq75GiVyKiZgYVOm
export VAULT_TOKEN_LEASE_DURATION=86400

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: vault-secrets-operator
type: Opaque
data:
  VAULT_TOKEN: $(echo -n "$VAULT_TOKEN" | base64)
  VAULT_TOKEN_LEASE_DURATION: $(echo -n "$VAULT_TOKEN_LEASE_DURATION" | base64)
EOF
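
The echo -n in the manifest above matters: values under data: must be base64-encoded, and a trailing newline would become part of the stored value. A minimal sketch of the difference, using the lease duration from this post (printf '%s' is used here as a portable equivalent of echo -n):

```shell
#!/usr/bin/env bash
VAULT_TOKEN_LEASE_DURATION=86400

# Encode the value without a trailing newline, as the manifest does.
encoded=$(printf '%s' "$VAULT_TOKEN_LEASE_DURATION" | base64)

# A plain echo appends a newline, which changes the encoded value.
encoded_with_newline=$(echo "$VAULT_TOKEN_LEASE_DURATION" | base64)

echo "without newline: $encoded"
echo "with newline:    $encoded_with_newline"

# Decoding confirms the round trip back to the original value.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "decoded: $decoded"
```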

Deploy Vault Secrets Operator

It is recommended to deploy the Operator with Helm, but first we need to create the values.yaml file to configure the Operator.

environmentVars:
  - name: VAULT_TOKEN
    valueFrom:
      secretKeyRef:
        name: vault-secrets-operator
        key: VAULT_TOKEN
  - name: VAULT_TOKEN_LEASE_DURATION
    valueFrom:
      secretKeyRef:
        name: vault-secrets-operator
        key: VAULT_TOKEN_LEASE_DURATION
vault:
  address: "http://vault.vault.svc:8200"

The environment variables point to the Secret that was created in the previous section to authenticate with Vault. We also need to provide the Vault address for the Operator to retrieve the secrets.

Now we can deploy the Vault Secrets Operator:

helm repo add ricoberger https://ricoberger.github.io/helm-charts
helm repo update

helm upgrade --install vault-secrets-operator ricoberger/vault-secrets-operator -f blog-data/sealed-secrets/values.yaml

Give me the Secret

I have a key created in my HashiCorp Vault:

$ vault kv get kvv2/mongod-secret
…
Key                                 Value
---                                 -----                                                                                                                                                                         
MONGODB_BACKUP_PASSWORD             <>
MONGODB_CLUSTER_ADMIN_PASSWORD      <>
MONGODB_CLUSTER_ADMIN_USER          <>
MONGODB_CLUSTER_MONITOR_PASSWORD    <>
MONGODB_CLUSTER_MONITOR_USER        <>                                                                                                                                                               
MONGODB_USER_ADMIN_PASSWORD         <>
MONGODB_USER_ADMIN_USER             <>

It is time to create the secret out of it. First, we will create the Custom Resource object of kind: VaultSecret.

$ cat blog-data/sealed-secrets/vs.yaml
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: my-secure-secret
spec:
  path: kvv2/mongod-secret
  type: Opaque

$ kubectl apply -f blog-data/sealed-secrets/vs.yaml

The Operator will connect to HashiCorp Vault and create a regular Secret object automatically:

$ kubectl get vaultsecret
NAME               SUCCEEDED   REASON    MESSAGE              LAST TRANSITION   AGE
my-secure-secret   True        Created   Secret was created   47m               47m

$ kubectl get secret  my-secure-secret
NAME               TYPE     DATA   AGE
my-secure-secret   Opaque   7      47m
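
To spot-check that the Secret really carries the values from Vault, a single key can be read back and decoded. This is a sketch: secret_value is a hypothetical helper name, and it assumes kubectl is pointed at the cluster and namespace where the Secret was created.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the decoded value of one key in a Secret.
# $1 is the Secret name, $2 is the key under .data (values there are base64-encoded).
secret_value() {
  kubectl get secret "$1" -o "jsonpath={.data.$2}" | base64 --decode
}
```

For example, secret_value my-secure-secret MONGODB_USER_ADMIN_USER should print the user name stored under that key in Vault. The same helper works for the Secret produced by Sealed Secrets.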

Deploy MongoDB Cluster

Now that the secrets are in place, it is time to deploy the Operator and the DB cluster:

kubectl apply -f blog-data/sealed-secrets/bundle.yaml
kubectl apply -f blog-data/sealed-secrets/cr.yaml

The cluster will be up in a minute or two and will use the secrets we deployed.

By the way, my cr.yaml deploys a MongoDB cluster with two shards. Support for multiple shards was added in version 1.7.0 of the Operator – I encourage you to try it out. Learn more about it here: Percona Server for MongoDB Sharding.

May
06
2020
--

GitHub gets a built-in IDE with Codespaces, discussion forums and more

Under different circumstances, GitHub would be hosting its Satellite conference in Paris this week. Like so many other events, GitHub decided to switch Satellite to a virtual event, but that isn’t stopping the Microsoft-owned company from announcing quite a bit of news this week.

The highlight of GitHub’s announcement is surely the launch of GitHub Codespaces, which gives developers a full cloud-hosted development environment, based on Microsoft’s VS Code editor. If that name sounds familiar, that’s likely because Microsoft itself rebranded Visual Studio Code Online to Visual Studio Codespaces a week ago — and GitHub is essentially taking the same concepts and technology and is now integrating it directly inside its service. If you’ve seen VS Online/Codespaces before, the GitHub environment will look very similar.

“Contributing code to a community can be hard. Every repository has its own way of configuring a dev environment, which often requires dozens of steps before you can write any code,” writes Shanku Niyogi, GitHub’s SVP of Product, in today’s announcement. “Even worse, sometimes the environment of two projects you are working on conflict with one another. GitHub Codespaces gives you a fully-featured cloud-hosted dev environment that spins up in seconds, directly within GitHub, so you can start contributing to a project right away.”

Currently, GitHub Codespaces is in beta and available for free. The company hasn’t set any pricing for the service once it goes live, but Niyogi says the pricing will look similar to that of GitHub Actions, where it charges for computationally intensive tasks like builds. Microsoft currently charges VS Codespaces users by the hour and depending on the kind of virtual machine they are using.

The other major new feature the company is announcing today is GitHub Discussions. These are essentially discussion forums for a given project. While GitHub already allowed for some degree of conversation around code through issues and pull requests, Discussions are meant to enable unstructured threaded conversations. They also lend themselves to Q&As, and GitHub notes that they can be a good place for maintaining FAQs and other documents.

Currently, Discussions are in beta for open-source communities and will be available for other projects soon.

On the security front, GitHub is also announcing two new features: code scanning and secret scanning. Code scanning checks your code for potential security vulnerabilities. It’s powered by CodeQL and free for open-source projects. Secret scanning is now available for private repositories (a similar feature has been available for public projects since 2018). Both of these features are part of GitHub Advanced Security.

As for GitHub’s enterprise customers, the company today announced the launch of Private Instances, a new fully managed service for enterprise customers that want to use GitHub in the cloud but know that their code is fully isolated from the rest of the company’s users. “Private Instances provides enhanced security, compliance, and policy features including bring-your-own-key encryption, backup archiving, and compliance with regional data sovereignty requirements,” GitHub explains in today’s announcement.

Nov
05
2019
--

ZenHub adds roadmapping to its GitHub project management tool

ZenHub, the popular project management tool that integrates right into GitHub, today announced the launch of Roadmaps. As you can guess from the name, this is a roadmapping feature that allows teams to better plan their projects ahead of time and visualize their status — all from within GitHub.

“We’re diving into a brand new category which is super exciting and we’re really starting to think not only about how forward-thinking software teams are managing their software projects but how they’re actually planning ahead,” ZenHub co-founder Aaron Upright told me. “And we’re really using this as an opportunity to really evolve the product and really introduce now a new kind of entrant into the space for product roadmapping.”

The product itself is indeed pretty straightforward. By default, it takes existing projects and epics a team has already defined and visualizes those on a timeline — including data about how many open issues still remain. In its current iteration, the tool is still pretty basic, but going forward ZenHub will add more advanced features, like blocking. As Upright noted, that’s just fine, though, because while the main goal here is to help teams plan, ZenHub also wants to give other stakeholders a kind of 30,000-foot overview of the state of a project without having to click around every issue in GitHub or Jira.

Upright also argues that existing solutions tend to fall short of what teams really need. “Smaller organizations — teams that are 10, 15 or 25 people — they can’t afford these tools. They’re really expensive. They’re cost-prohibitive,” he said. “And so oftentimes what they do is they turn to Excel files or Google spreadsheets in order to keep track of their roadmap. And keeping the spreadsheets up to date really becomes a complex and really a full-time job.” Yet those tools that are affordable often don’t offer a way to sync data back and forth between GitHub and their platforms, which results in the product team not getting those updates in GitHub, for example. Because ZenHub lives inside of GitHub, that’s obviously not a problem.

ZenHub Roadmaps is now available to all users.

Oct
04
2018
--

GitHub gets a new and improved Jira Software Cloud integration

Atlassian’s Jira has become a standard for managing large software projects in many companies. Many of those same companies also use GitHub as their source code repository and, unsurprisingly, there has long been an official way to integrate the two. That old way, however, was often slow, limited in its capabilities and unable to cope with the large code bases that many enterprises now manage on GitHub.

Almost as if to prove that GitHub remains committed to an open ecosystem, even after the Microsoft acquisition, the company today announced a new and improved integration between the two products.

“Working with Atlassian on the Jira integration was really important for us,” GitHub’s director of ecosystem engineering Kyle Daigle told me ahead of the announcement. “Because we want to make sure that our developer customers are getting the best experience of our open platform that they can have, regardless of what tools they use.”

So a couple of months ago, the team decided to build its own Jira integration from the ground up, and it’s committed to maintaining and improving it over time. As Daigle noted, the improvements here include better performance and a better user experience.

The new integration now also makes it easier to view all the pull requests, commits and branches from GitHub that are associated with a Jira issue, search for issues based on information from GitHub and see the status of the development work right in Jira, too. And because changes in GitHub trigger an update to Jira, too, that data should remain up to date at all times.

The old Jira integration over the so-called Jira DVCS connector will be deprecated and GitHub will start prompting existing users to do the upgrade over the next few weeks. The new integration is now a GitHub app, so that also comes with all of the security features the platform has to offer.

Sep
19
2018
--

GitLab raises $100M

GitLab, the developer service that aims to offer a full lifecycle DevOps platform, today announced that it has raised a $100 million Series D funding round at a valuation of $1.1 billion. The round was led by Iconiq.

As GitLab CEO Sid Sijbrandij told me, this round, which brings the company’s total funding to $145.5 million, will help it reach its goal of an IPO by November 2020.

According to Sijbrandij, GitLab’s original plan was to raise a new funding round at a valuation over $1 billion early next year. But since Iconiq came along with an offer that pretty much matched what the company set out to achieve in a few months anyway, the team decided to go ahead and raise the round now. Unsurprisingly, Microsoft’s acquisition of GitHub earlier this year helped to accelerate those plans, too.

“We weren’t planning on fundraising actually. I did block off some time in my calendar next year, starting from February 25th to do the next fundraise,” Sijbrandij said. “Our plan is to IPO in November of 2020 and we anticipated one more fundraise. I think in the current climate, where the macroeconomics are really good and GitHub got acquired, people are seeing that there’s one independent company, one startup left basically in this space. And we saw an opportunity to become best in class in a lot of categories.”

As Sijbrandij stressed, while most people still look at GitLab as a GitHub and Bitbucket competitor (and given the similarity in their names, who wouldn’t?), GitLab wants to be far more than that. It now offers products in nine categories and also sees itself as competing with the likes of VersionOne, Jira, Jenkins, Artifactory, Electric Cloud, Puppet, New Relic and BlackDuck.

“The biggest misunderstanding we’re seeing is that GitLab is an alternative to GitHub and we’ve grown beyond that,” he said. “We are now in nine categories all the way from planning to monitoring.”

Sijbrandij notes that there’s a billion-dollar player in every space that GitLab competes in. “But we want to be better,” he said. “And that’s only possible because we are open core, so people co-create these products with us. That being said, there’s still a lot of work on our side, helping to get those contributions over the finish line, making sure performance and quality stay up, establish a consistent user interface. These are things that typically don’t come from the wider community and with this fundraise of $100 million, we will be able to make sure we can sustain that effort in all the different product categories.”

Given this focus, GitLab will invest most of the funding in its engineering efforts to build out its existing products but also to launch new ones. The company plans to launch new features like tracing and log aggregation, for example.

With this very public commitment to an IPO, GitLab is also signaling that it plans to stay independent. That’s very much Sijbrandij’s plan, at least, though he admitted that “there’s always a price” if somebody came along and wanted to acquire the company. He did note that he likes the transparency that comes with being a public company.

“We always managed to be more bullish about the company than the rest of the world,” he said. “But the rest of the world is starting to catch up. This fundraise is a statement that we now have the money to become a public company where we’re not interested in being acquired. That is what we’re setting out to do.”

Jul
12
2018
--

GitHub Enterprise and Business Cloud users now get access to public repos, too

GitHub, the code hosting service Microsoft recently acquired, is launching a couple of new features for its business users today that’ll make it easier for them to access public repositories on the service.

Traditionally, users on the hosted Business Cloud and self-hosted Enterprise were not able to directly access the millions of public open-source repositories on the service. Now, with the service’s release, that’s changing, and business users will be able to reach beyond their firewalls to engage and collaborate with the rest of the GitHub community directly.

With this, GitHub now also offers its business and enterprise users a new unified search feature that lets them tap into their internal repos but also look at open-source ones.

Other new features in this latest Enterprise release include the ability to ignore whitespace when reviewing changes, the ability to require multiple reviewers for code changes, automated support tickets and more. You can find a full list of all updates here.

Microsoft’s acquisition of GitHub wasn’t fully unexpected (and it’s worth noting that the acquisition hasn’t closed yet), but it is still controversial, given that Microsoft and the open-source community, which heavily relies on GitHub, haven’t always seen eye-to-eye in the past. I’m personally not too worried about that, and it feels like the dust has settled at this point and that people are waiting to see what Microsoft will do with the service.

May
14
2015
--

MySQL QA Episode 2: Build a MySQL server – Git, Bazaar, Compiling & Build tools

Welcome to MySQL QA Episode 2: Build a MySQL Server – Git, Bazaar (bzr), Compiling, and Build Tools

In this episode you’ll learn how to build Percona Server and/or MySQL Server for QA purposes & more in this short 25-minute tutorial.

In HD quality (set your player to 720p!)

To watch the other episodes in this series, see the MySQL QA & Bash Linux Training Series post. If you missed MySQL QA Episode 1, it was titled “Bash/GNU Tools & Linux Upskill & Scripting Fun.” You can watch it here.

If you have any questions or comments, please leave them below.

The post MySQL QA Episode 2: Build a MySQL server – Git, Bazaar, Compiling & Build tools appeared first on MySQL Performance Blog.

Sep
10
2014
--

With Stash Data Center, Atlassian Brings Git To Large Enterprises

With Stash, Atlassian has been offering a Git-based code-management solution for quite a while now. Up until now the company was mostly going after relatively small teams with this service, but today it is launching Stash Data Center, its Git solution for large enterprises. Unlike the regular Stash service, Stash Data Center can run on a cluster instead of a single server. Thanks to this,…

Sep
25
2013
--

Experimental Git mirror of Oracle MySQL trees

I’ve been working on setting up mirrors on GitHub of all our BZR branches. My first efforts that are at a suitable stage to share are mirrors of the Oracle MySQL trees. This is currently a snapshot of MySQL 5.1, 5.5 and 5.6 with all the tags preserved. I’ve managed to get Git to compact down the repository to a mere 177MB on disk for all the history, which is rather impressive.

Go check it out: https://github.com/percona/mysql

This should be considered experimental and I may end up pushing up something better at some point soon – i.e. don’t rely on being able to merge later updates (think rebase rather than merge).

The post Experimental Git mirror of Oracle MySQL trees appeared first on MySQL Performance Blog.
