-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add host VM - GPU compatibility checks for GCP #989
Changes from 24 commits
3f50915
4d1b142
8319e23
9adb053
a5f27e2
a3fd8d0
6a4b345
0cd3bf9
e5e5f9d
135fa71
940827c
737cb71
fc23709
b8e5864
c9ad421
02d4159
974ad50
5040460
cc32bcd
102f0c9
b182e76
eaa28ff
b2a19e4
309081d
50bbd91
b90549b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -883,4 +883,10 @@ def _fill_in_launchable_resources( | |
launchable[resources] = _filter_out_blocked_launchable_resources( | ||
launchable[resources], blocked_launchable_resources) | ||
|
||
for r in launchable[resources]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Q: why move it to here, rather than after L849? Was thinking checking There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do you mean by the "expanded" resources? I thought this check should be applied to every case, as the max cpu and memory limits must be respected to launch an instance on GCP. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant that here launchable[resources] may have more than 1 element, - can some of them pass the check, while some fail? In these cases it may make sense to remove the candidates that fail rather than raising an error to the whole program. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. That makes sense. I've rolled back the change. |
||
if isinstance(r.cloud, clouds.GCP): | ||
# Check if the host VM satisfies the max vCPU and memory limits. | ||
clouds.GCP.check_accelerator_attachable_to_host( | ||
r.instance_type, r.accelerators, r.zone) | ||
|
||
return launchable, cloud_candidates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add #989 (comment) to this func and the next func (L207+)? It's great explanation on why these two funcs are structured this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.