llm
gpu_energy(model_active_parameter_count, output_token_count, gpu_energy_alpha, gpu_energy_beta)
Compute energy consumption of a single GPU.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_active_parameter_count` | `float` | Number of active parameters of the model. | required |
`output_token_count` | `float` | Number of generated tokens. | required |
`gpu_energy_alpha` | `float` | Alpha parameter of the GPU linear power consumption profile. | required |
`gpu_energy_beta` | `float` | Beta parameter of the GPU linear power consumption profile. | required |
Returns:
Type | Description |
---|---|
`float` | The energy consumption of a single GPU in kWh. |
Source code in ecologits/impacts/llm.py
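As a rough illustration of what a linear power consumption profile can look like, the sketch below assumes that the energy per generated token is a linear function of the active parameter count, multiplied by the number of output tokens. The function name `sketch_gpu_energy`, the exact formula, and the numeric values are assumptions made for illustration; they are not taken from `ecologits/impacts/llm.py`.

```python
def sketch_gpu_energy(
    model_active_parameter_count: float,
    output_token_count: float,
    gpu_energy_alpha: float,
    gpu_energy_beta: float,
) -> float:
    """Assumed form: energy per token is linear in the active parameter count."""
    energy_per_token = gpu_energy_alpha * model_active_parameter_count + gpu_energy_beta
    return output_token_count * energy_per_token


# Hypothetical values chosen only to keep the arithmetic easy to follow.
example_gpu_energy = sketch_gpu_energy(
    model_active_parameter_count=10.0,
    output_token_count=100.0,
    gpu_energy_alpha=1e-6,
    gpu_energy_beta=1e-5,
)
```

Under this assumption, doubling the number of generated tokens doubles the estimated GPU energy.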
generation_latency(model_active_parameter_count, output_token_count, gpu_latency_alpha, gpu_latency_beta, request_latency)
Compute the token generation latency in seconds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_active_parameter_count` | `float` | Number of active parameters of the model. | required |
`output_token_count` | `float` | Number of generated tokens. | required |
`gpu_latency_alpha` | `float` | Alpha parameter of the GPU linear latency profile. | required |
`gpu_latency_beta` | `float` | Beta parameter of the GPU linear latency profile. | required |
`request_latency` | `float` | Measured request latency (upper bound) in seconds. | required |
Returns:
Type | Description |
---|---|
`float` | The token generation latency in seconds. |
Source code in ecologits/impacts/llm.py
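The description of `request_latency` as an upper bound suggests that a modeled latency is capped by the measured one. The sketch below shows one way this could work, assuming the modeled latency per token is linear in the active parameter count. The name `sketch_generation_latency`, the formula, and the values are illustrative assumptions, not the library's code.

```python
def sketch_generation_latency(
    model_active_parameter_count: float,
    output_token_count: float,
    gpu_latency_alpha: float,
    gpu_latency_beta: float,
    request_latency: float,
) -> float:
    """Assumed form: linear per-token latency, capped by the measured request latency."""
    latency_per_token = gpu_latency_alpha * model_active_parameter_count + gpu_latency_beta
    modeled_latency = output_token_count * latency_per_token
    # The measured latency acts as an upper bound on the modeled value.
    return min(modeled_latency, request_latency)


# Hypothetical values: the modeled latency is about 3 s.
capped = sketch_generation_latency(10.0, 100.0, 0.001, 0.02, request_latency=2.0)
uncapped = sketch_generation_latency(10.0, 100.0, 0.001, 0.02, request_latency=5.0)
```

In the first call the measured latency (2 s) is lower than the modeled one, so it wins; in the second call the modeled value is returned.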
model_required_memory(model_total_parameter_count, model_quantization_bits)
Compute the required memory to load the model on GPU.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_total_parameter_count` | `float` | Total number of parameters of the model. | required |
`model_quantization_bits` | `int` | Number of bits used to represent the model weights. | required |
Returns:
Type | Description |
---|---|
`float` | The amount of required GPU memory to load the model. |
Source code in ecologits/impacts/llm.py
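A back-of-the-envelope version of this estimate multiplies the parameter count by the bytes per weight. The sketch below does that and adds a multiplicative overhead factor. The name `sketch_model_required_memory`, the `memory_overhead` parameter and its value of 1.2, and the convention that a parameter count in billions yields gigabytes are all assumptions for illustration.

```python
def sketch_model_required_memory(
    model_total_parameter_count: float,
    model_quantization_bits: int,
    memory_overhead: float = 1.2,  # hypothetical overhead factor, not from the source
) -> float:
    """Assumed form: parameters x bytes per weight, inflated by an overhead factor."""
    bytes_per_parameter = model_quantization_bits / 8
    return memory_overhead * model_total_parameter_count * bytes_per_parameter


# Hypothetical example: 70 (billion) parameters stored on 16 bits.
example_memory = sketch_model_required_memory(70.0, 16)
# Halving the precision to 8 bits halves the estimate.
example_memory_8bit = sketch_model_required_memory(70.0, 8)
```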
gpu_required_count(model_required_memory, gpu_memory)
Compute the number of GPUs required to store the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_required_memory` | `float` | Required memory to load the model on GPU. | required |
`gpu_memory` | `float` | Amount of memory available on a single GPU. | required |
Returns:
Type | Description |
---|---|
`int` | The number of required GPUs to load the model. |
Source code in ecologits/impacts/llm.py
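Since the result is an integer number of GPUs, a natural reading is a ceiling division of the required memory by the per-GPU memory. The sketch below assumes exactly that; the name `sketch_gpu_required_count` and the values are illustrative, and both arguments are assumed to use the same memory unit.

```python
import math


def sketch_gpu_required_count(model_required_memory: float, gpu_memory: float) -> int:
    """Assumed form: round the memory ratio up to a whole number of GPUs."""
    return math.ceil(model_required_memory / gpu_memory)


# Hypothetical example: 168 units of memory on GPUs offering 80 units each.
example_gpu_count = sketch_gpu_required_count(168.0, 80.0)
```

With these values the ratio is 2.1, so three GPUs are needed; a model that fits exactly in one GPU would still return 1.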
server_energy(generation_latency, server_power, server_gpu_count, gpu_required_count)
Compute the energy consumption of the server.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`generation_latency` | `float` | Token generation latency in seconds. | required |
`server_power` | `float` | Power consumption of the server in kW. | required |
`server_gpu_count` | `int` | Number of available GPUs in the server. | required |
`gpu_required_count` | `int` | Number of required GPUs to load the model. | required |
Returns:
Type | Description |
---|---|
`float` | The energy consumption of the server (GPUs are not included) in kWh. |
Source code in ecologits/impacts/llm.py
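The units involved (seconds, kW, kWh) imply a conversion from seconds to hours. The sketch below additionally assumes that the request is only charged for the share of the server corresponding to the GPUs it uses. The name `sketch_server_energy`, the allocation rule, and the values are assumptions for illustration.

```python
def sketch_server_energy(
    generation_latency: float,
    server_power: float,
    server_gpu_count: int,
    gpu_required_count: int,
) -> float:
    """Assumed form: power x time (in hours) x share of the server that is used."""
    latency_hours = generation_latency / 3600  # seconds to hours
    server_share = gpu_required_count / server_gpu_count
    return latency_hours * server_power * server_share


# Hypothetical example: 3.6 s on a 1 kW server, using 2 of its 8 GPUs.
example_server_energy = sketch_server_energy(3.6, 1.0, 8, 2)
```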
request_energy(datacenter_pue, server_energy, gpu_required_count, gpu_energy)
Compute the energy consumption of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`datacenter_pue` | `float` | PUE of the datacenter. | required |
`server_energy` | `float` | Energy consumption of the server in kWh. | required |
`gpu_required_count` | `int` | Number of required GPUs to load the model. | required |
`gpu_energy` | `float` | Energy consumption of a single GPU in kWh. | required |
Returns:
Type | Description |
---|---|
`float` | The energy consumption of the request in kWh. |
Source code in ecologits/impacts/llm.py
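One plausible way to combine these inputs is to sum the server energy and the energy of all required GPUs, then scale the total by the PUE to account for datacenter overhead. The sketch below follows that reading; the name `sketch_request_energy`, the formula, and the values are assumptions, not the library's implementation.

```python
def sketch_request_energy(
    datacenter_pue: float,
    server_energy: float,
    gpu_required_count: int,
    gpu_energy: float,
) -> float:
    """Assumed form: PUE x (server energy + energy of all required GPUs)."""
    it_energy = server_energy + gpu_required_count * gpu_energy
    return datacenter_pue * it_energy


# Hypothetical example: 0.00025 kWh for the server, 2 GPUs at 0.001 kWh each, PUE of 1.2.
example_request_energy = sketch_request_energy(1.2, 0.00025, 2, 0.001)
```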
request_usage_gwp(request_energy, if_electricity_mix_gwp)
Compute the Global Warming Potential (GWP) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`request_energy` | `float` | Energy consumption of the request in kWh. | required |
`if_electricity_mix_gwp` | `float` | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
Returns:
Type | Description |
---|---|
`float` | The GWP usage impact of the request in kgCO2eq. |
Source code in ecologits/impacts/llm.py
request_usage_adpe(request_energy, if_electricity_mix_adpe)
Compute the Abiotic Depletion Potential for Elements (ADPe) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`request_energy` | `float` | Energy consumption of the request in kWh. | required |
`if_electricity_mix_adpe` | `float` | ADPe impact factor of electricity consumption in kgSbeq / kWh. | required |
Returns:
Type | Description |
---|---|
`float` | The ADPe usage impact of the request in kgSbeq. |
Source code in ecologits/impacts/llm.py
request_usage_pe(request_energy, if_electricity_mix_pe)
Compute the Primary Energy (PE) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`request_energy` | `float` | Energy consumption of the request in kWh. | required |
`if_electricity_mix_pe` | `float` | PE impact factor of electricity consumption in MJ / kWh. | required |
Returns:
Type | Description |
---|---|
`float` | The PE usage impact of the request in MJ. |
Source code in ecologits/impacts/llm.py
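The three usage impacts (GWP, ADPe, PE) share the same shape: an energy in kWh multiplied by an impact factor expressed per kWh, which is what the units of the inputs and outputs imply. The sketch below uses a single hypothetical helper, `sketch_usage_impact`, with made-up impact factors.

```python
def sketch_usage_impact(request_energy: float, impact_factor: float) -> float:
    """Energy (kWh) times an impact factor (unit / kWh) gives an impact in that unit."""
    return request_energy * impact_factor


# Hypothetical request energy and electricity mix factors.
example_energy_kwh = 0.0027
usage_gwp = sketch_usage_impact(example_energy_kwh, 0.5)    # kgCO2eq
usage_adpe = sketch_usage_impact(example_energy_kwh, 1e-7)  # kgSbeq
usage_pe = sketch_usage_impact(example_energy_kwh, 10.0)    # MJ
```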
server_gpu_embodied_gwp(server_embodied_gwp, server_gpu_count, gpu_embodied_gwp, gpu_required_count)
Compute the Global Warming Potential (GWP) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_embodied_gwp` | `float` | GWP embodied impact of the server in kgCO2eq. | required |
`server_gpu_count` | `float` | Number of available GPUs in the server. | required |
`gpu_embodied_gwp` | `float` | GWP embodied impact of a single GPU in kgCO2eq. | required |
`gpu_required_count` | `int` | Number of required GPUs to load the model. | required |
Returns:
Type | Description |
---|---|
`float` | The GWP embodied impact of the server and the GPUs in kgCO2eq. |
Source code in ecologits/impacts/llm.py
server_gpu_embodied_adpe(server_embodied_adpe, server_gpu_count, gpu_embodied_adpe, gpu_required_count)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_embodied_adpe` | `float` | ADPe embodied impact of the server in kgSbeq. | required |
`server_gpu_count` | `float` | Number of available GPUs in the server. | required |
`gpu_embodied_adpe` | `float` | ADPe embodied impact of a single GPU in kgSbeq. | required |
`gpu_required_count` | `int` | Number of required GPUs to load the model. | required |
Returns:
Type | Description |
---|---|
`float` | The ADPe embodied impact of the server and the GPUs in kgSbeq. |
Source code in ecologits/impacts/llm.py
server_gpu_embodied_pe(server_embodied_pe, server_gpu_count, gpu_embodied_pe, gpu_required_count)
Compute the Primary Energy (PE) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_embodied_pe` | `float` | PE embodied impact of the server in MJ. | required |
`server_gpu_count` | `float` | Number of available GPUs in the server. | required |
`gpu_embodied_pe` | `float` | PE embodied impact of a single GPU in MJ. | required |
`gpu_required_count` | `int` | Number of required GPUs to load the model. | required |
Returns:
Type | Description |
---|---|
`float` | The PE embodied impact of the server and the GPUs in MJ. |
Source code in ecologits/impacts/llm.py
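The three server-and-GPU embodied impacts take the same kinds of inputs, so one sketch can cover them. It assumes the server's embodied impact is allocated in proportion to the GPUs the model uses, and that each required GPU contributes its full embodied impact. The name `sketch_server_gpu_embodied`, the allocation rule, and the values are assumptions for illustration.

```python
def sketch_server_gpu_embodied(
    server_embodied: float,
    server_gpu_count: float,
    gpu_embodied: float,
    gpu_required_count: int,
) -> float:
    """Assumed form: used share of the server impact plus the impact of each required GPU."""
    server_share = gpu_required_count / server_gpu_count
    return server_share * server_embodied + gpu_required_count * gpu_embodied


# Hypothetical GWP example: a 3000 kgCO2eq server with 8 GPUs at 150 kgCO2eq each,
# of which the model needs 2.
example_embodied_gwp = sketch_server_gpu_embodied(3000.0, 8, 150.0, 2)
```

The same helper applies to ADPe (kgSbeq) and PE (MJ) as long as the server and GPU values use the same unit.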
request_embodied_gwp(server_gpu_embodied_gwp, server_lifetime, generation_latency)
Compute the Global Warming Potential (GWP) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_gpu_embodied_gwp` | `float` | GWP embodied impact of the server and the GPUs in kgCO2eq. | required |
`server_lifetime` | `float` | Lifetime duration of the server in seconds. | required |
`generation_latency` | `float` | Token generation latency in seconds. | required |
Returns:
Type | Description |
---|---|
`float` | The GWP embodied impact of the request in kgCO2eq. |
Source code in ecologits/impacts/llm.py
request_embodied_adpe(server_gpu_embodied_adpe, server_lifetime, generation_latency)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_gpu_embodied_adpe` | `float` | ADPe embodied impact of the server and the GPUs in kgSbeq. | required |
`server_lifetime` | `float` | Lifetime duration of the server in seconds. | required |
`generation_latency` | `float` | Token generation latency in seconds. | required |
Returns:
Type | Description |
---|---|
`float` | The ADPe embodied impact of the request in kgSbeq. |
Source code in ecologits/impacts/llm.py
request_embodied_pe(server_gpu_embodied_pe, server_lifetime, generation_latency)
Compute the Primary Energy (PE) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`server_gpu_embodied_pe` | `float` | PE embodied impact of the server and the GPUs in MJ. | required |
`server_lifetime` | `float` | Lifetime duration of the server in seconds. | required |
`generation_latency` | `float` | Token generation latency in seconds. | required |
Returns:
Type | Description |
---|---|
`float` | The PE embodied impact of the request in MJ. |
Source code in ecologits/impacts/llm.py
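Because both the server lifetime and the generation latency are given in seconds, a natural allocation is to charge the request for the fraction of the hardware lifetime it occupies. The sketch below assumes this time-based amortization for all three criteria; the name `sketch_request_embodied`, the formula, and the five-year lifetime are illustrative assumptions.

```python
def sketch_request_embodied(
    server_gpu_embodied: float,
    server_lifetime: float,
    generation_latency: float,
) -> float:
    """Assumed form: embodied impact scaled by the fraction of lifetime used."""
    return (generation_latency / server_lifetime) * server_gpu_embodied


# Hypothetical example: 1050 kgCO2eq of hardware, a 5-year lifetime, a 3.6 s generation.
five_years_in_seconds = 5 * 365 * 24 * 3600
example_request_embodied = sketch_request_embodied(1050.0, five_years_in_seconds, 3.6)
```

Under this assumption the embodied impact of a request grows linearly with its generation latency.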
compute_llm_impacts_dag(model_active_parameter_count, model_total_parameter_count, output_token_count, request_latency, if_electricity_mix_adpe, if_electricity_mix_pe, if_electricity_mix_gwp, model_quantization_bits=MODEL_QUANTIZATION_BITS, gpu_energy_alpha=GPU_ENERGY_ALPHA, gpu_energy_beta=GPU_ENERGY_BETA, gpu_latency_alpha=GPU_LATENCY_ALPHA, gpu_latency_beta=GPU_LATENCY_BETA, gpu_memory=GPU_MEMORY, gpu_embodied_gwp=GPU_EMBODIED_IMPACT_GWP, gpu_embodied_adpe=GPU_EMBODIED_IMPACT_ADPE, gpu_embodied_pe=GPU_EMBODIED_IMPACT_PE, server_gpu_count=SERVER_GPUS, server_power=SERVER_POWER, server_embodied_gwp=SERVER_EMBODIED_IMPACT_GWP, server_embodied_adpe=SERVER_EMBODIED_IMPACT_ADPE, server_embodied_pe=SERVER_EMBODIED_IMPACT_PE, server_lifetime=HARDWARE_LIFESPAN, datacenter_pue=DATACENTER_PUE)
Compute the impacts DAG (directed acyclic graph) of an LLM generation request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_active_parameter_count` | `float` | Number of active parameters of the model. | required |
`model_total_parameter_count` | `float` | Total number of parameters of the model. | required |
`output_token_count` | `float` | Number of generated tokens. | required |
`request_latency` | `float` | Measured request latency in seconds. | required |
`if_electricity_mix_adpe` | `float` | ADPe impact factor of electricity consumption in kgSbeq / kWh (antimony equivalent). | required |
`if_electricity_mix_pe` | `float` | PE impact factor of electricity consumption in MJ / kWh. | required |
`if_electricity_mix_gwp` | `float` | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
`model_quantization_bits` | `Optional[int]` | Number of bits used to represent the model weights. | `MODEL_QUANTIZATION_BITS` |
`gpu_energy_alpha` | `Optional[float]` | Alpha parameter of the GPU linear power consumption profile. | `GPU_ENERGY_ALPHA` |
`gpu_energy_beta` | `Optional[float]` | Beta parameter of the GPU linear power consumption profile. | `GPU_ENERGY_BETA` |
`gpu_latency_alpha` | `Optional[float]` | Alpha parameter of the GPU linear latency profile. | `GPU_LATENCY_ALPHA` |
`gpu_latency_beta` | `Optional[float]` | Beta parameter of the GPU linear latency profile. | `GPU_LATENCY_BETA` |
`gpu_memory` | `Optional[float]` | Amount of memory available on a single GPU. | `GPU_MEMORY` |
`gpu_embodied_gwp` | `Optional[float]` | GWP embodied impact of a single GPU in kgCO2eq. | `GPU_EMBODIED_IMPACT_GWP` |
`gpu_embodied_adpe` | `Optional[float]` | ADPe embodied impact of a single GPU in kgSbeq. | `GPU_EMBODIED_IMPACT_ADPE` |
`gpu_embodied_pe` | `Optional[float]` | PE embodied impact of a single GPU in MJ. | `GPU_EMBODIED_IMPACT_PE` |
`server_gpu_count` | `Optional[int]` | Number of available GPUs in the server. | `SERVER_GPUS` |
`server_power` | `Optional[float]` | Power consumption of the server in kW. | `SERVER_POWER` |
`server_embodied_gwp` | `Optional[float]` | GWP embodied impact of the server in kgCO2eq. | `SERVER_EMBODIED_IMPACT_GWP` |
`server_embodied_adpe` | `Optional[float]` | ADPe embodied impact of the server in kgSbeq. | `SERVER_EMBODIED_IMPACT_ADPE` |
`server_embodied_pe` | `Optional[float]` | PE embodied impact of the server in MJ. | `SERVER_EMBODIED_IMPACT_PE` |
`server_lifetime` | `Optional[float]` | Lifetime duration of the server in seconds. | `HARDWARE_LIFESPAN` |
`datacenter_pue` | `Optional[float]` | PUE of the datacenter. | `DATACENTER_PUE` |
Returns:
Type | Description |
---|---|
`dict[str, float]` | The impacts DAG with all intermediate states. |
Source code in ecologits/impacts/llm.py
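The signatures on this page hint at how such a DAG can be wired: several parameter names (`request_energy`, `server_energy`, and so on) match the names of other functions. The sketch below shows a tiny resolver that connects functions by argument name and returns every intermediate value. The helper `run_dag_sketch`, the two stand-in node functions, their formulas, and the input values are all assumptions for illustration; this is not how `compute_llm_impacts_dag` is necessarily implemented.

```python
import inspect


def run_dag_sketch(tasks: dict, **inputs: float) -> dict:
    """Evaluate each task once all of its named arguments are available."""
    state = dict(inputs)
    pending = dict(tasks)
    while pending:
        progressed = False
        for name, func in list(pending.items()):
            arg_names = list(inspect.signature(func).parameters)
            if all(arg in state for arg in arg_names):
                state[name] = func(**{arg: state[arg] for arg in arg_names})
                del pending[name]
                progressed = True
        if not progressed:
            raise ValueError(f"Unresolvable tasks: {sorted(pending)}")
    return state


# Stand-in nodes with assumed formulas; only their argument names matter here.
def request_usage_gwp(request_energy, if_electricity_mix_gwp):
    return request_energy * if_electricity_mix_gwp


def request_energy(datacenter_pue, server_energy, gpu_required_count, gpu_energy):
    return datacenter_pue * (server_energy + gpu_required_count * gpu_energy)


dag_state = run_dag_sketch(
    {"request_usage_gwp": request_usage_gwp, "request_energy": request_energy},
    datacenter_pue=1.2,
    server_energy=0.001,
    gpu_required_count=2,
    gpu_energy=0.002,
    if_electricity_mix_gwp=0.5,
)
```

The returned dictionary contains the inputs alongside the computed `request_energy` and `request_usage_gwp`, regardless of the order in which the tasks were listed.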
compute_llm_impacts(model_active_parameter_count, model_total_parameter_count, output_token_count, if_electricity_mix_adpe, if_electricity_mix_pe, if_electricity_mix_gwp, request_latency=None, **kwargs)
Compute the impacts of an LLM generation request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model_active_parameter_count` | `ValueOrRange` | Number of active parameters of the model. | required |
`model_total_parameter_count` | `ValueOrRange` | Total number of parameters of the model. | required |
`output_token_count` | `float` | Number of generated tokens. | required |
`if_electricity_mix_adpe` | `float` | ADPe impact factor of electricity consumption in kgSbeq / kWh (antimony equivalent). | required |
`if_electricity_mix_pe` | `float` | PE impact factor of electricity consumption in MJ / kWh. | required |
`if_electricity_mix_gwp` | `float` | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
`request_latency` | `Optional[float]` | Measured request latency in seconds. | `None` |
`**kwargs` | `Any` | Any other optional parameter. | `{}` |
Returns:
Type | Description |
---|---|
`Impacts` | The impacts of an LLM generation request. |
Source code in ecologits/impacts/llm.py