-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCP: make sky check
enable the 3 required APIs & optional TPU API.
#1197
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,13 +4,17 @@ | |
import hashlib | ||
import os | ||
import pathlib | ||
import re | ||
import subprocess | ||
import sys | ||
import time | ||
import uuid | ||
|
||
import colorama | ||
from cryptography.hazmat.primitives import serialization | ||
from cryptography.hazmat.primitives.asymmetric import rsa | ||
from cryptography.hazmat.backends import default_backend | ||
import googleapiclient | ||
|
||
from sky import clouds | ||
from sky import sky_logging | ||
|
@@ -19,8 +23,7 @@ | |
from sky.utils import subprocess_utils | ||
from sky.utils import ux_utils | ||
|
||
logger = sky_logging.init_logger(__name__) | ||
|
||
colorama.init() | ||
logger = sky_logging.init_logger(__name__) | ||
|
||
# TODO: Should tolerate if gcloud is not installed. Also, | ||
|
@@ -195,7 +198,37 @@ def setup_gcp_authentication(config): | |
credentials=None, | ||
cache_discovery=False) | ||
user = config['auth']['ssh_user'] | ||
project = compute.projects().get(project=project_id).execute() | ||
|
||
try: | ||
project = compute.projects().get(project=project_id).execute() | ||
except googleapiclient.errors.HttpError as e: | ||
# Can happen for a new project where Compute Engine API is disabled. | ||
# Ex: | ||
# 'Compute Engine API has not been used in project 123456 before | ||
# or it is disabled. Enable it by visiting | ||
# https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=123456 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can use
without There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is an example message meant for reminding us developers. Changed the comment a bit to be clearer. |
||
# then retry. If you enabled this API recently, wait a few minutes for | ||
# the action to propagate to our systems and retry.' | ||
if ' API has not been used in project' in e.reason: | ||
match = re.fullmatch(r'(.+)(https://.*project=\d+) (.+)', e.reason) | ||
if match is None: | ||
raise # This should not happen. | ||
yellow = colorama.Fore.YELLOW | ||
reset = colorama.Style.RESET_ALL | ||
bright = colorama.Style.BRIGHT | ||
dim = colorama.Style.DIM | ||
logger.error( | ||
f'{yellow}Certain GCP APIs are disabled for the GCP project ' | ||
f'{project_id}.{reset}') | ||
logger.error(f'{yellow}Enable them by running:{reset} ' | ||
f'{bright}sky check{reset}') | ||
logger.error('Details:') | ||
logger.error(f'{dim}{match.group(1)}{reset}\n' | ||
f'{dim} {match.group(2)}{reset}\n' | ||
f'{dim}{match.group(3)}{reset}') | ||
sys.exit(1) | ||
else: | ||
raise | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want to raise Runtime error with more details printed if different error happens? also to catch possible message pattern change in the future. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note (1) this statement will already print details of the exception, and (2) there's a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah now I see. thanks for clarification. |
||
|
||
project_oslogin = next( | ||
(item for item in project['commonInstanceMetadata'].get('items', []) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,7 @@ | |
import json | ||
import os | ||
import subprocess | ||
import time | ||
import typing | ||
from typing import Dict, Iterator, List, Optional, Tuple | ||
|
||
|
@@ -61,6 +62,16 @@ def _run_output(cmd): | |
return proc.stdout.decode('ascii') | ||
|
||
|
||
def is_api_disabled(endpoint: str, project_id: str) -> bool: | ||
proc = subprocess.run((f'gcloud services list --project {project_id} ' | ||
f' | grep {endpoint}.googleapis.com'), | ||
check=False, | ||
shell=True, | ||
stderr=subprocess.PIPE, | ||
stdout=subprocess.PIPE) | ||
return proc.returncode != 0 | ||
|
||
|
||
@clouds.CLOUD_REGISTRY.register | ||
class GCP(clouds.Cloud): | ||
"""Google Cloud Platform.""" | ||
|
@@ -347,6 +358,46 @@ def check_credentials(self) -> Tuple[bool, Optional[str]]: | |
'For more info: ' | ||
'https://skypilot.readthedocs.io/en/latest/getting-started/installation.html' # pylint: disable=line-too-long | ||
) | ||
|
||
# Check APIs. | ||
project_id = self.get_project_id() | ||
apis = ( | ||
('compute', 'Compute Engine'), | ||
('cloudresourcemanager', 'Cloud Resource Manager'), | ||
('iam', 'Identity and Access Management (IAM)'), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we also enable TPU API for users? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great catch. I verified that a new project doesn't have tpu enabled. After
Then, manually running
we see this prompt is why it's stuck. Changed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah thanks for verifying this! |
||
) | ||
enabled_api = False | ||
for endpoint, display_name in apis: | ||
if is_api_disabled(endpoint, project_id): | ||
# For 'compute': ~55-60 seconds for the first run. If already | ||
# enabled, ~1s. Other API endpoints take ~1-5s to enable. | ||
if endpoint == 'compute': | ||
suffix = ' (may take a minute)' | ||
else: | ||
suffix = '' | ||
print(f'\nEnabling {display_name} API{suffix}...') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we say this doesn't incur any cost? as last time a Stanford user was concerning about it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good call, done. |
||
t1 = time.time() | ||
proc = subprocess.run( | ||
f'gcloud services enable {endpoint}.googleapis.com ' | ||
f'--project {project_id}', | ||
check=False, | ||
shell=True, | ||
stdout=subprocess.PIPE, | ||
stderr=subprocess.STDOUT) | ||
if proc.returncode == 0: | ||
enabled_api = True | ||
print(f'Done. Took {time.time() - t1:.1f} secs.') | ||
else: | ||
print('Failed; please manually enable the API. Log:') | ||
print(proc.stdout.decode()) | ||
return False, ( | ||
f'{display_name} API is disabled. Please retry ' | ||
'`sky check` in a few minutes, or manually enable it.') | ||
if enabled_api: | ||
print('\nHint: Enabled GCP API(s) may take a few minutes to take ' | ||
'effect. If any SkyPilot commands/calls failed, retry after ' | ||
'some time.') | ||
|
||
return True, None | ||
|
||
def get_credential_file_mounts(self) -> Dict[str, str]: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to add another
except
at the end to catch other unexpected errors?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should catch blindly. We should let those unexpected errors surface so that we know what's up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure that makes sense. was thinking about printing some additional msgs but I don't think it's necessary as users can't easily fix this themselves.